Abstract

So-called combined approaches answer a conjunctive query 
over a description logic ontology in three steps: first, they ma- 
terialise certain consequences of the ontology and the data; 
second, they evaluate the query over the data; and third, they 
filter the result of the second phase to eliminate unsound an- 
swers. Such approaches were developed for various members 
of the DL-Lite and the $\mathcal{EL}$ families of languages, but none
of them can handle ontologies containing nominals. In our 
work, we bridge this gap and present a combined query answering
approach for $\mathcal{ELHO}^r_\bot$, a logic that contains all features
of the OWL 2 EL standard apart from transitive roles
and complex role inclusions. This extension is nontrivial be- 
cause nominals require equality reasoning, which introduces 
complexity into the first and the third step. Our empirical 
evaluation suggests that our technique is suitable for practical 
application, and so it provides a practical basis for conjunc- 
tive query answering in a large fragment of OWL 2 EL. 

Introduction 

Description logics (DLs) (Baader et al. 2007) are a family of 
knowledge representation formalisms that underpin OWL 2 
(Cuenca Grau et al. 2008) — an ontology language used in 
advanced information systems with many practical applica- 
tions. Answering conjunctive queries (CQs) over ontology- 
enriched data sets is a core reasoning service in such sys- 
tems, so the computational aspects of this problem have re- 
ceived a lot of interest lately. For expressive DLs, the prob- 
lem is at least doubly exponential in query size (Glimm et 
al. 2008). The problem, however, becomes easier for the $\mathcal{EL}$
(Baader, Brandt, and Lutz 2005) and the DL-Lite (Calvanese 
et al. 2007) families of DLs, which provide the foundation 
for the OWL 2 EL and the OWL 2 QL profiles of OWL 2. An 
important goal of this research was to devise not only worst- 
case optimal, but also practical algorithms. The known ap- 
proaches can be broadly classified as follows. 

The first group consists of automata-based approaches 
for DLs such as OWL 2 EL (Krötzsch, Rudolph, and Hitzler 2007)
and Horn-$\mathcal{SHOIQ}$ and Horn-$\mathcal{SROIQ}$ (Ortiz,
Rudolph, and Simkus 2011). While worst-case optimal,
these approaches are typically not suitable for practice since 
their best-case and worst-case performance often coincide. 

The second group consists of rewriting-based approaches.
Roughly speaking, these approaches rewrite the ontology 
and/or the query into another formalism, typically a union 
of conjunctive queries or a datalog program; the relevant 
answers can then be obtained by evaluating the rewriting 
over the data. Rewriting-based approaches were developed 
for members of the DL-Lite family (Calvanese et al. 2007; 
Artale et al. 2009), and the DLs $\mathcal{ELHIO}^\neg$ (Pérez-Urbina,
Motik, and Horrocks 2010) and Horn-SHIQ (Eiter et al. 
2012), to name just a few. A common problem, however, 
is that rewritings can be exponential in the ontology and/or 
query size. Although this is often not a problem in practice, 
such approaches are not worst-case optimal. An exception is 
the algorithm by Rosati (2007) that rewrites an $\mathcal{ELH}_\bot$ ontology
into a datalog program of polynomial size; however,
the algorithm also uses a nondeterministic step to transform 
the CQ into a tree-shaped one, and it is not clear how to im- 
plement this step in a goal-directed manner. 

The third group consists of combined approaches, which 
use a three-step process: first, they augment the data with 
certain consequences of the ontology; second, they evaluate 
the CQ over the augmented data; and third, they filter the re- 
sult of the second phase to eliminate unsound answers. The 
third step is necessary because, to ensure termination, the 
first step is unsound and may introduce facts that do not fol- 
low from the ontology; however, this is done in a way that 
makes the third step feasible. Such approaches have been de- 
veloped for logics in the DL-Lite (Kontchakov et al. 2011) 
and the $\mathcal{EL}$ (Lutz, Toman, and Wolter 2009) families, and
they are appealing because they are worst-case optimal and 
practical: only the second step is intractable (in query size), 
but it can be solved using well-known database techniques. 

None of the combined approaches proposed thus far, how- 
ever, handles nominals — concepts containing precisely one 
individual. Nominals are included in OWL 2 EL, and they 
are often used to state that all instances of a class have a 
certain property value, such as 'the sex of all men is male', 
or 'each German city is located in Germany'. In this paper
we present a combined approach for $\mathcal{ELHO}^r_\bot$, the DL that
covers all features of OWL 2 EL apart from transitive roles 
and complex role inclusions. To the best of our knowledge, 
this is the first combined approach that handles nominals. 
Our extension is nontrivial because nominals require equality
reasoning, which increases the complexity of the first and
the third step of the algorithm. In particular, nominals may
introduce recursive dependencies in the filtering conditions
used in the third phase; this is in contrast to the known combined
approach for $\mathcal{EL}$ (Lutz, Toman, and Wolter 2009), in
which filtering conditions are not recursive and can be incorporated
into the input query. To solve this problem, our algorithm
evaluates the original CQ and then uses a polynomial-time
function to check the relevant conditions for each answer.

Following Krötzsch, Rudolph, and Hitzler (2008), instead
of directly materialising the relevant consequences of the on- 
tology and the data, we transform the ontology into a datalog 
program that captures the relevant consequences. Although 
seemingly just a stylistic issue, a datalog-based specification 
may be beneficial in practice: one can either materialise all 
consequences of the program bottom-up in advance, or one 
can use a top-down technique to compute only the consequences
relevant for the query at hand. The latter can be particularly
useful in information systems that have read-only
access to the data, or where data changes frequently. 

We have implemented a prototypical system using our algorithm,
and we carried out a preliminary empirical evaluation
of (i) the blowup in the number of facts introduced by
the datalog program, and (ii) the number of unsound answers
obtained in the second phase. Our experiments show both of 
these numbers to be manageable in typical cases, suggesting 
that our algorithm provides a practical basis for answering 
CQs in an expressive fragment of OWL 2 EL. 

The proofs of our technical results are provided in this 
paper's appendix. 

Preliminaries 

Logic Programming. We use the standard notions of variables,
constants, function symbols, terms, atoms, formulas,
and sentences (Fitting 1996). We often identify a conjunction
with the set of its conjuncts. A substitution $\sigma$ is a partial
mapping of variables to terms; $\mathsf{dom}(\sigma)$ and $\mathsf{rng}(\sigma)$ are
the domain and the range of $\sigma$, respectively; $\sigma|_S$ is the
restriction of $\sigma$ to a set of variables $S$; and, for $\alpha$ a term or a
formula, $\sigma(\alpha)$ is the result of simultaneously replacing each
free variable $x$ occurring in $\alpha$ with $\sigma(x)$. A Horn clause $C$
is an expression of the form $B_1 \wedge \ldots \wedge B_m \to H$, where
$H$ and each $B_i$ are atoms. Such $C$ is a fact if $m = 0$, and
it is commonly written as $H$. Furthermore, $C$ is safe if each
variable occurring in $H$ also occurs in some $B_i$. A logic program
$\Sigma$ is a finite set of safe Horn clauses; furthermore, $\Sigma$ is
a datalog program if each clause in $\Sigma$ is function-free.

In this paper, we interpret a logic program $\Sigma$ in a model
that can be constructed bottom-up. The Herbrand universe
of $\Sigma$ is the set of all terms built from the constants and
the function symbols occurring in $\Sigma$. Given an arbitrary set
of facts $B$, let $\Sigma(B)$ be the smallest superset of $B$ such
that, for each clause $\varphi \to \psi \in \Sigma$ and each substitution $\sigma$
mapping the variables occurring in the clause to the Herbrand
universe of $\Sigma$, if $\sigma(\varphi) \subseteq B$, then $\sigma(\psi) \subseteq \Sigma(B)$. Let
$I_0$ be the set of all facts occurring in $\Sigma$; for each $i \in \mathbb{N}$, let
$I_{i+1} = \Sigma(I_i)$; and let $I = \bigcup_{i \in \mathbb{N}} I_i$. Then $I$ is the minimal
Herbrand model of $\Sigma$, and it is well known that $I$ satisfies
$\forall \vec{x}.C$ for each Horn clause $C \in \Sigma$ and $\vec{x}$ the vector of all
variables occurring in $C$.
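
The bottom-up construction above can be illustrated for the function-free (datalog) case with a naive fixpoint loop. This is only a minimal sketch under assumptions that are not part of the paper: atoms are encoded as Python tuples, variables are strings starting with "?", clauses are (body, head) pairs, and the initial facts are passed separately rather than as clauses with an empty body. The names `match`, `apply_once`, and `minimal_model` are hypothetical.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with "?".
    return isinstance(term, str) and term.startswith("?")

def match(atom, fact, sigma):
    """Extend substitution sigma so that sigma(atom) == fact, or return None."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    sigma = dict(sigma)
    for a, c in zip(atom[1:], fact[1:]):
        if is_var(a):
            if sigma.setdefault(a, c) != c:
                return None
        elif a != c:
            return None
    return sigma

def apply_once(clauses, facts):
    """One step B -> Sigma(B): add the head of every clause whose body matches B."""
    derived = set(facts)
    for body, head in clauses:
        sigmas = [{}]
        for atom in body:
            sigmas = [s2 for s in sigmas for f in facts
                      if (s2 := match(atom, f, s)) is not None]
        for s in sigmas:
            # Clauses are safe, so every head variable is bound by s.
            derived.add((head[0], *(s.get(t, t) for t in head[1:])))
    return derived

def minimal_model(clauses, facts):
    """Iterate I_0, I_1, ... until nothing new is derived (terminates for datalog)."""
    current = set(facts)
    while True:
        nxt = apply_once(clauses, current)
        if nxt == current:
            return current
        current = nxt
```

For programs with function symbols this loop need not terminate, which is why the sketch is restricted to datalog.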



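Type  Axiom                              Clause(s)
1     $\{a\} \sqsubseteq A$              $A(a)$
2     $A \sqsubseteq B$                  $A(x) \to B(x)$
3     $A \sqsubseteq \{a\}$              $A(x) \to x \approx a$
4     $A_1 \sqcap A_2 \sqsubseteq A$     $A_1(x) \wedge A_2(x) \to A(x)$
5     $\exists R.A_1 \sqsubseteq A$      $R(x, y) \wedge A_1(y) \to A(x)$
6     $A_1 \sqsubseteq \exists R.A$      $A_1(x) \to R(x, f_{R,A}(x))$
                                         $A_1(x) \to A(f_{R,A}(x))$
7     $R \sqsubseteq S$                  $R(x, y) \to S(x, y)$
8     $\mathsf{range}(R, A)$             $R(x, y) \to A(y)$

Table 1: Transforming $\mathcal{ELHO}^r_\bot$ Axioms into Horn Clauses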

In this paper we allow a logic program $\Sigma$ to contain the
equality predicate $\approx$. In first-order logic, $\approx$ is usually interpreted
as the identity over the interpretation domain; however,
$\approx$ can also be explicitly axiomatised (Fitting 1996).
Let $\Sigma_\approx$ be the set containing clauses (1)-(3), an instance of
clause (4) for each $n$-ary predicate $R$ occurring in $\Sigma$ and
each $1 \le i \le n$, and an instance of clause (5) for each $n$-ary
function symbol $f$ occurring in $\Sigma$ and each $1 \le i \le n$.

$\to x \approx x$    (1)
$x_1 \approx x_2 \to x_2 \approx x_1$    (2)
$x_1 \approx x_2 \wedge x_2 \approx x_3 \to x_1 \approx x_3$    (3)
$R(x_1, \ldots, x_i, \ldots, x_n) \wedge x_i \approx x'_i \to R(x_1, \ldots, x'_i, \ldots, x_n)$    (4)
$x_i \approx x'_i \to f(x_1, \ldots, x_i, \ldots, x_n) \approx f(x_1, \ldots, x'_i, \ldots, x_n)$    (5)

The minimal Herbrand model of a logic program $\Sigma$ that
contains $\approx$ is the minimal Herbrand model of $\Sigma \cup \Sigma_\approx$.

Conjunctive Queries. A conjunctive query (CQ) is a
formula $q = \exists \vec{y}.\psi(\vec{x}, \vec{y})$ with $\psi$ a conjunction of function-free
atoms over variables $\vec{x} \cup \vec{y}$. Variables $\vec{x}$ are the answer
variables of $q$. Let $N_T(q)$ be the set of terms occurring in $q$.

Let $\tau$ be a substitution such that $\mathsf{rng}(\tau)$ contains only constants.
Then, $\tau(q) = \exists \vec{z}.\tau(\psi)$, where $\vec{z}$ is obtained from $\vec{y}$
by removing each variable $y \in \vec{y}$ for which $\tau(y)$ is defined.
Note that, according to this definition, non-free variables can
also be replaced; for example, given $q = \exists y_1, y_2.R(y_1, y_2)$
and $\tau = \{y_2 \mapsto a\}$, we have $\tau(q) = \exists y_1.R(y_1, a)$.

Let $\Sigma$ be a logic program, let $I$ be the minimal Herbrand
model of $\Sigma$, and let $q = \exists \vec{y}.\psi(\vec{x}, \vec{y})$ be a CQ that uses only
the predicates occurring in $\Sigma$. A substitution $\pi$ is a candidate
answer for $q$ in $\Sigma$ if $\mathsf{dom}(\pi) = \vec{x}$ and $\mathsf{rng}(\pi)$ contains only
constants; furthermore, such a $\pi$ is a certain answer to $q$
over $\Sigma$, written $\Sigma \models \pi(q)$, if a substitution $\tau$ exists such that
$\mathsf{dom}(\tau) = \vec{x} \cup \vec{y}$, $\pi = \tau|_{\vec{x}}$, and $\tau(q) \subseteq I$.
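
When the minimal Herbrand model is finite and available as a set of facts, the condition that a substitution maps the query into the model can be checked by simple backtracking. The sketch below is illustrative only and makes assumptions not found in the paper: atoms are tuples, variables are strings starting with "?", and `matches` and `answers` are hypothetical names. It enumerates all matches and projects them to the answer variables; it does not check that the range contains only constants.

```python
def is_var(term):
    # Assumed convention: variables are strings starting with "?".
    return term.startswith("?")

def matches(atoms, facts, tau=None):
    """Yield every substitution tau over the query variables with tau(atoms) in facts."""
    tau = tau or {}
    if not atoms:
        yield dict(tau)
        return
    (pred, *args), rest = atoms[0], atoms[1:]
    for fact in facts:
        if fact[0] != pred or len(fact) != len(args) + 1:
            continue
        ext = dict(tau)
        # Bind each variable consistently; constants must coincide with the fact.
        if all(ext.setdefault(a, c) == c if is_var(a) else a == c
               for a, c in zip(args, fact[1:])):
            yield from matches(rest, facts, ext)

def answers(answer_vars, atoms, facts):
    """Project every match to the answer variables."""
    return {tuple(tau[x] for x in answer_vars) for tau in matches(atoms, facts)}
```

A production system would delegate this step to a database engine with join ordering; the nested loop here only mirrors the definition.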

Description Logic. DL $\mathcal{ELHO}^r_\bot$ is defined w.r.t. a signature
consisting of mutually disjoint and countably infinite
sets $N_C$, $N_R$, and $N_I$ of atomic concepts (i.e., unary
predicates), roles (i.e., binary predicates), and individuals
(i.e., constants), respectively. Furthermore, for each individual
$a \in N_I$, expression $\{a\}$ denotes a nominal, that is, a
concept containing precisely the individual $a$. Also, we assume
that $\top$ and $\bot$ are unary predicates (without any predefined
meaning) not occurring in $N_C$. We consider only
normalised knowledge bases, as it is well known (Baader,
Brandt, and Lutz 2005) that each $\mathcal{ELHO}^r_\bot$ knowledge base
can be normalised in polynomial time without affecting the
answers to CQs. An $\mathcal{ELHO}^r_\bot$ TBox is a finite set of axioms
of the form shown in the left-hand side of Table 1,
where $A_{(i)} \in N_C \cup \{\top\}$, $B \in N_C \cup \{\top, \bot\}$, $R, S \in N_R$,
and $a \in N_I$. An ABox $\mathcal{A}$ is a finite set of facts constructed
using the symbols from $N_C \cup \{\top, \bot\}$, $N_R$, and $N_I$. Finally,
an $\mathcal{ELHO}^r_\bot$ knowledge base (KB) is a tuple $\mathcal{K} = (\mathcal{T}, \mathcal{A})$,
where $\mathcal{T}$ is an $\mathcal{ELHO}^r_\bot$ TBox and $\mathcal{A}$ is an ABox such
that each predicate occurring in $\mathcal{A}$ also occurs in $\mathcal{T}$.

We interpret $\mathcal{K}$ as a logic program. Table 1 shows how to
translate a TBox $\mathcal{T}$ into a logic program $\Xi(\mathcal{T})$. Moreover,
let $\top(\mathcal{T})$ be the set of the following clauses instantiated for
each atomic concept $A$ and each role $R$ occurring in $\mathcal{T}$.

$A(x) \to \top(x)$    $R(x, y) \to \top(x)$    $R(x, y) \to \top(y)$

A knowledge base $\mathcal{K} = (\mathcal{T}, \mathcal{A})$ is translated into the logic
program $\Xi(\mathcal{K}) = \Xi(\mathcal{T}) \cup \top(\mathcal{T}) \cup \mathcal{A}$. Then, $\mathcal{K}$ is unsatisfiable
if $\Xi(\mathcal{K}) \models \exists y.\bot(y)$. Furthermore, given a conjunctive
query $q$ and a candidate answer $\pi$ for $q$, we write
$\mathcal{K} \models \pi(q)$ iff $\mathcal{K}$ is unsatisfiable or $\Xi(\mathcal{K}) \models \pi(q)$. Although
somewhat nonstandard, our definitions of DLs are equivalent
to the ones based on the standard denotational semantics
(Baader et al. 2007). Given a candidate answer $\pi$ for $q$, deciding
whether $\Xi(\mathcal{K}) \models \pi(q)$ holds is NP-complete in combined
complexity, and PTIME-complete in data complexity
(Krötzsch, Rudolph, and Hitzler 2007).

Datalog Rewriting of $\mathcal{ELHO}^r_\bot$ TBoxes

For the rest of this section, we fix an arbitrary $\mathcal{ELHO}^r_\bot$
knowledge base $\mathcal{K} = (\mathcal{T}, \mathcal{A})$. We next show how to transform
$\mathcal{K}$ into a datalog program $D(\mathcal{K})$ that can be used to
check the satisfiability of $\mathcal{K}$. In the following section, we
then show how to use $D(\mathcal{K})$ to answer conjunctive queries.

Due to axioms of type 6 (cf. Table 1), $\Xi(\mathcal{K})$ may contain
function symbols and is generally not a datalog program;
thus, the evaluation of $\Xi(\mathcal{K})$ may not terminate. To ensure
termination, we eliminate function symbols from $\Xi(\mathcal{K})$ using
the technique by Krötzsch, Rudolph, and Hitzler (2008):
for each $A \in N_C \cup \{\top\}$ and each $R \in N_R$ occurring in
$\mathcal{T}$, we introduce a globally fresh and unique auxiliary individual
$o_{R,A}$. Intuitively, $o_{R,A}$ represents all terms in the
Herbrand universe of $\Xi(\mathcal{K})$ needed to satisfy the existential
concept $\exists R.A$. Krötzsch, Rudolph, and Hitzler (2008) used
this technique to facilitate taxonomic reasoning, while we
use it to obtain a practical CQ answering algorithm. Note
that $o_{R,A}$ depends on both $R$ and $A$, whereas in the
known approaches such individuals depend only on $A$ (Lutz,
Toman, and Wolter 2009) or $R$ (Kontchakov et al. 2011).

Definition 1. Datalog program $D(\mathcal{T})$ is obtained by translating
each axiom of type other than 6 in the TBox $\mathcal{T}$ of $\mathcal{K}$
into a clause as shown in Table 1, and by translating each
axiom $A_1 \sqsubseteq \exists R.A$ in $\mathcal{T}$ into clauses $A_1(x) \to R(x, o_{R,A})$
and $A_1(x) \to A(o_{R,A})$. Furthermore, the translation of $\mathcal{K}$
into datalog is given by $D(\mathcal{K}) = D(\mathcal{T}) \cup \top(\mathcal{T}) \cup \mathcal{A}$.

Example 1. Let $\mathcal{T}$ be the following $\mathcal{ELHO}^r_\bot$ TBox:

$\mathit{KRC} \sqsubseteq \exists \mathit{taught}.\mathit{JProf}$        $\exists \mathit{taught}.\top \sqsubseteq \mathit{Course}$
$\mathit{Course} \sqsubseteq \exists \mathit{taught}.\mathit{Prof}$      $\{\mathit{kr}\} \sqsubseteq \mathit{KRC}$
$\mathit{Prof} \sqsubseteq \exists \mathit{advisor}.\mathit{Prof}$       $\mathit{KRC} \sqsubseteq \mathit{Course}$
$\mathit{JProf} \sqsubseteq \{\mathit{john}\}$                           $\mathsf{range}(\mathit{taught}, \mathit{Prof})$

Figure 1: Representing the Models of $\Xi(\mathcal{K})$.

Then, $D(\mathcal{T})$ contains the following clauses:

$\mathit{KRC}(x) \to \mathit{taught}(x, o_{T,J})$         $\mathit{JProf}(x) \to x \approx \mathit{john}$
$\mathit{KRC}(x) \to \mathit{JProf}(o_{T,J})$             $\mathit{taught}(x, y) \to \mathit{Course}(x)$
$\mathit{Course}(x) \to \mathit{taught}(x, o_{T,P})$      $\mathit{KRC}(\mathit{kr})$
$\mathit{Course}(x) \to \mathit{Prof}(o_{T,P})$           $\mathit{KRC}(x) \to \mathit{Course}(x)$
$\mathit{Prof}(x) \to \mathit{advisor}(x, o_{A,P})$       $\mathit{taught}(x, y) \to \mathit{Prof}(y)$
$\mathit{Prof}(x) \to \mathit{Prof}(o_{A,P})$
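
The translation of Definition 1 can be sketched as a single pass over a normalised TBox. The encoding below is an assumption made purely for illustration and does not come from the paper: axioms are tagged tuples, clauses are (body, head) pairs, variables are strings starting with "?", equality is the predicate name "eq", and the auxiliary individual $o_{R,A}$ is spelled "o_R_A". The function name `to_datalog` and the axiom tags are hypothetical.

```python
def to_datalog(tbox):
    """Translate normalised axioms (types 1-8 of Table 1) into datalog clauses."""
    clauses = []
    for kind, *args in tbox:
        if kind == "nominal_in":        # type 1: {a} subsumed by A
            a, A = args
            clauses.append(((), (A, a)))
        elif kind == "sub":             # type 2: A subsumed by B
            A, B = args
            clauses.append((((A, "?x"),), (B, "?x")))
        elif kind == "in_nominal":      # type 3: A subsumed by {a}
            A, a = args
            clauses.append((((A, "?x"),), ("eq", "?x", a)))
        elif kind == "and":             # type 4: A1 and A2 subsumed by A
            A1, A2, A = args
            clauses.append((((A1, "?x"), (A2, "?x")), (A, "?x")))
        elif kind == "exists_lhs":      # type 5: exists R.A1 subsumed by A
            R, A1, A = args
            clauses.append((((R, "?x", "?y"), (A1, "?y")), (A, "?x")))
        elif kind == "exists_rhs":      # type 6: A1 subsumed by exists R.A
            A1, R, A = args
            aux = f"o_{R}_{A}"          # one auxiliary individual per (R, A) pair
            clauses.append((((A1, "?x"),), (R, "?x", aux)))
            clauses.append((((A1, "?x"),), (A, aux)))
        elif kind == "role_sub":        # type 7: R subsumed by S
            R, S = args
            clauses.append((((R, "?x", "?y"),), (S, "?x", "?y")))
        elif kind == "range":           # type 8: range(R, A)
            R, A = args
            clauses.append((((R, "?x", "?y"),), (A, "?y")))
        else:
            raise ValueError(f"unknown axiom kind: {kind}")
    return clauses
```

Each axiom yields at most two clauses of constant size, which is in line with the linear bound stated for the translation.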

The following result straightforwardly follows from the
definition of $\Xi(\mathcal{K})$ and $D(\mathcal{K})$.

Proposition 2. Program $D(\mathcal{K})$ can be computed in time linear
in the size of $\mathcal{K}$.

Next, we prove that the datalog program $D(\mathcal{K})$ can be
used to decide the satisfiability of $\mathcal{K}$. To this end, we define
a function $\delta$ that maps each term $w$ in the Herbrand universe
of $\Xi(\mathcal{K})$ to the Herbrand universe of $D(\mathcal{K})$ as follows:

$$\delta(w) = \begin{cases} w & \text{if } w \in N_I, \\ o_{R,A} & \text{if } w \text{ is of the form } w = f_{R,A}(w'). \end{cases}$$

Let $I$ and $J$ be the minimal Herbrand models of $\Xi(\mathcal{K})$ and
$D(\mathcal{K})$, respectively. Mapping $\delta$ establishes a tight relationship
between $I$ and $J$, as illustrated in the following example.

Example 2. Let $\mathcal{A} = \{\mathit{Course}(\mathit{ai})\}$, let $\mathcal{T}$ be as in Example 1,
and let $\mathcal{K} = (\mathcal{T}, \mathcal{A})$. Figure 1 shows a graphical representation
of the minimal Herbrand models $I$ and $J$ of $\Xi(\mathcal{K})$
and $D(\mathcal{K})$, respectively. The grey dotted lines show how $\delta$ relates
the terms in $I$ to the terms in $J$. For the sake of clarity,
Figure 1 does not show the reflexivity of $\approx$.

Mapping $\delta$ is a homomorphism from $I$ to $J$.

Lemma 3. Let $I$ and $J$ be the minimal Herbrand models
of $\Xi(\mathcal{K})$ and $D(\mathcal{K})$, respectively. Mapping $\delta$ satisfies
the following three properties for all terms $w'$ and $w$, each
$B \in N_C \cup \{\top, \bot\}$, and each $R \in N_R$.

1. $B(w) \in I$ implies $B(\delta(w)) \in J$.

2. $R(w', w) \in I$ implies $R(\delta(w'), \delta(w)) \in J$.

3. $w' \approx w \in I$ implies $\delta(w') \approx \delta(w) \in J$.

For a similar result in the other direction, we need a couple
of definitions. Let $H$ be an arbitrary Herbrand model. Then,
$\mathsf{dom}(H)$ is the set containing each term $w$ that occurs in $H$
in at least one fact with a predicate in $N_C \cup \{\top, \bot\} \cup N_R$;
note that, by this definition, we have $w \notin \mathsf{dom}(H)$ whenever
$w$ occurs in $H$ only in assertions involving the $\approx$ predicate.
Furthermore, $\mathsf{aux}_H$ is the set of all terms $w \in \mathsf{dom}(H)$ such
that, for each term $w'$ with $w \approx w' \in H$, we have $w' \notin N_I$.
We say that the terms in $\mathsf{aux}_H$ are 'true' auxiliary terms,
that is, they are not equal to an individual in $N_I$. In Figure
1, bold terms are 'true' auxiliary terms in $I$ and $J$.

Lemma 4. Let $I$ and $J$ be the minimal Herbrand models of
$\Xi(\mathcal{K})$ and $D(\mathcal{K})$, respectively. Mapping $\delta$ satisfies the following
five properties for all terms $w_1$ and $w_2$ in $\mathsf{dom}(I)$,
each $B \in N_C \cup \{\top, \bot\}$, and each $R \in N_R$.

1. $B(\delta(w_1)) \in J$ implies that $B(w_1) \in I$.

2. $R(\delta(w_1), \delta(w_2)) \in J$ and $\delta(w_2) \notin \mathsf{aux}_J$ imply that
$R(w_1, w_2) \in I$.

3. $R(\delta(w_1), \delta(w_2)) \in J$ and $\delta(w_2) \in \mathsf{aux}_J$ imply that
$\delta(w_2)$ is of the form $o_{P,A}$, that $R(w_1, f_{P,A}(w_1)) \in I$, and
that a term $w'_1$ exists such that $R(w'_1, w_2) \in I$.

4. $\delta(w_1) \approx \delta(w_2) \in J$ and $\delta(w_2) \notin \mathsf{aux}_J$ imply that
$w_1 \approx w_2 \in I$.

5. For each term $u$ occurring in $J$, a term $w \in \mathsf{dom}(I)$ exists
such that $\delta(w) = u$.

Lemmas 3 and 4 allow us to decide the satisfiability of
$\mathcal{K}$ by answering a simple query over $D(\mathcal{K})$, as shown in
Proposition 5. The complexity claim is due to the fact that
each clause in $D(\mathcal{K})$ contains a bounded number of variables
(Dantsin et al. 2001).

Proposition 5. For $\mathcal{K}$ an arbitrary $\mathcal{ELHO}^r_\bot$ knowledge
base, $\Xi(\mathcal{K}) \models \exists y.\bot(y)$ if and only if $D(\mathcal{K}) \models \exists y.\bot(y)$.
Furthermore, the satisfiability of $\mathcal{K}$ can be checked in time
polynomial in the size of $\mathcal{K}$.

Answering Conjunctive Queries 

In this section, we fix a satisfiable $\mathcal{ELHO}^r_\bot$ knowledge base
$\mathcal{K} = (\mathcal{T}, \mathcal{A})$ and a conjunctive query $q = \exists \vec{y}.\psi(\vec{x}, \vec{y})$. Furthermore,
we fix $I$ and $J$ to be the minimal Herbrand models
of $\Xi(\mathcal{K})$ and $D(\mathcal{K})$, respectively.

While $D(\mathcal{K})$ can be used to decide the satisfiability of $\mathcal{K}$,
the following example shows that $D(\mathcal{K})$ cannot be used directly
to compute the answers to $q$.

Example 3. Let $\mathcal{K}$ be as in Example 2, and let $q_1$, $q_2$, and
$q_3$ be the following conjunctive queries:

$q_1 = \mathit{taught}(x_1, x_2)$

$q_2 = \exists y_1, y_2, y_3.\ \mathit{taught}(x_1, y_1) \wedge \mathit{taught}(x_2, y_2) \wedge \mathit{advisor}(y_1, y_3) \wedge \mathit{advisor}(y_2, y_3)$

$q_3 = \exists y.\ \mathit{advisor}(y, y)$

Furthermore, let $\tau_i$ be the following substitutions:

$\tau_1 = \{x_1 \mapsto \mathit{kr},\ x_2 \mapsto o_{T,P}\}$
$\tau_2 = \{x_1 \mapsto \mathit{kr},\ x_2 \mapsto \mathit{ai},\ y_1 \mapsto o_{T,P},\ y_2 \mapsto o_{T,P},\ y_3 \mapsto o_{A,P}\}$
$\tau_3 = \{y \mapsto o_{A,P}\}$

Finally, let each $\pi_i$ be the projection of $\tau_i$ to the answer
variables of $q_i$. Using Figure 1, one can readily check that
$D(\mathcal{K}) \models \tau_i(q_i)$, but $\Xi(\mathcal{K}) \not\models \pi_i(q_i)$, for each $1 \le i \le 3$.



This can be explained by observing that $J$ is a homomorphic
image of $I$. Now homomorphisms preserve CQ answers
(i.e., $\Xi(\mathcal{K}) \models \pi(q)$ implies $D(\mathcal{K}) \models \pi(q)$), but they can also
introduce unsound answers (i.e., $D(\mathcal{K}) \models \pi(q)$ does not
necessarily imply $\Xi(\mathcal{K}) \models \pi(q)$). This gives rise to the following
notion of spurious answers.

Definition 6. A substitution $\tau$ with $\mathsf{dom}(\tau) = \vec{x} \cup \vec{y}$ and
$D(\mathcal{K}) \models \tau(q)$ is a spurious answer to $q$ if $\tau|_{\vec{x}}$ is not a certain
answer to $q$ over $\Xi(\mathcal{K})$.

Based on these observations, we answer $q$ over $\mathcal{K}$ in two
steps: first, we evaluate $q$ over $D(\mathcal{K})$ and thus obtain an overestimation
of the certain answers to $q$ over $\Xi(\mathcal{K})$; second, for
each substitution $\tau$ obtained in the first step, we use a special
function isSpur to check whether $\tau$ is spurious, and we discard
$\tau$ if it is. We next
formally introduce this function. We first present all relevant
definitions, after which we discuss the intuitions. As we shall
see, each query in Example 3 illustrates a distinct source of
spuriousness that our function needs to deal with.

Definition 7. Let $\tau$ be a substitution s.t. $\mathsf{dom}(\tau) = \vec{x} \cup \vec{y}$
and $D(\mathcal{K}) \models \tau(q)$. Relation $\sim\ \subseteq N_T(q) \times N_T(q)$ for $q$, $\tau$,
and $D(\mathcal{K})$ is the smallest reflexive, symmetric, and transitive
relation closed under the fork rule, where $\mathsf{aux}_{D(\mathcal{K})}$ is the set
containing each individual $u$ from $D(\mathcal{K})$ for which no individual
$c \in N_I$ exists such that $D(\mathcal{K}) \models u \approx c$.

(fork)  $\dfrac{s' \sim t'}{s \sim t}$  if $R(s, s')$ and $P(t, t')$ occur in $q$, and $\tau(s') \in \mathsf{aux}_{D(\mathcal{K})}$

Note that the definition of $\mathsf{aux}_{D(\mathcal{K})}$ is a reformulation
of the definition of $\mathsf{aux}_J$, but based on the consequences
of $D(\mathcal{K})$ rather than the facts in $J$.

Relation $\sim$ is reflexive, symmetric, and transitive, so it is
an equivalence relation, which allows us to normalise each
term $t \in N_T(q)$ to a representative of its equivalence class
using the mapping $\gamma$ defined below. We then construct a
graph $G_{\mathsf{aux}}$ that lets us check whether substitution $\tau$ matches 'true'
auxiliary individuals in a way that cannot be converted to a
match over 'true' auxiliary terms in $I$.

Definition 8. Let $\tau$ and $\sim$ be as specified in Definition 7.
Function $\gamma : N_T(q) \to N_T(q)$ maps each term $t \in N_T(q)$
to an arbitrary, but fixed representative $\gamma(t)$ of the equivalence
class of $\sim$ that contains $t$. Furthermore, the directed
graph $G_{\mathsf{aux}} = (V_{\mathsf{aux}}, E_{\mathsf{aux}})$ is defined as follows.

• Set $V_{\mathsf{aux}}$ contains a vertex $\gamma(t) \in N_T(q)$ for each term
$t \in N_T(q)$ such that $\tau(t) \in \mathsf{aux}_{D(\mathcal{K})}$.

• Set $E_{\mathsf{aux}}$ contains an edge $(\gamma(s), \gamma(t))$ for each atom of
the form $R(s, t)$ in $q$ such that $\{\gamma(s), \gamma(t)\} \subseteq V_{\mathsf{aux}}$.

Query $q$ is aux-cyclic w.r.t. $\tau$ and $D(\mathcal{K})$ if $G_{\mathsf{aux}}$ contains a
cycle; otherwise, $q$ is aux-acyclic w.r.t. $\tau$ and $D(\mathcal{K})$.

We are now ready to define our function that checks
whether a substitution $\tau$ is a spurious answer.

Definition 9. Let $\tau$ and $\sim$ be as specified in Definition 7.
Then, function $\mathsf{isSpur}(q, D(\mathcal{K}), \tau)$ returns $\mathsf{t}$ if and only if at
least one of the following conditions holds.

(a) Variable $x \in \vec{x}$ exists such that $\tau(x) \notin N_I$.

(b) Terms $s$ and $t$ occurring in $q$ exist such that $s \sim t$ and
$D(\mathcal{K}) \not\models \tau(s) \approx \tau(t)$.

(c) Query $q$ is aux-cyclic w.r.t. $\tau$ and $D(\mathcal{K})$.
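
Definitions 7 to 9 can be put together in a short sketch. It is an illustration under stated assumptions rather than the implementation evaluated in this paper: the query is given by its answer variables and its role atoms as (role, s, t) triples, `tau` is a dictionary over the query variables, `named` is the set of individuals in $N_I$ occurring in the knowledge base, and `entails_eq(u, v)` is a hypothetical oracle deciding $D(\mathcal{K}) \models u \approx v$. The equivalence relation is maintained with a union-find structure, and aux-cyclicity is checked by depth-first search.

```python
from itertools import product

def is_spur(answer_vars, role_atoms, tau, named, entails_eq):
    """Return True iff condition (a), (b), or (c) of Definition 9 holds."""
    def val(term):
        return tau.get(term, term)           # constants are mapped to themselves

    def is_aux(u):                           # u is in aux_{D(K)}
        return not any(entails_eq(u, c) for c in named)

    terms = {x for _, s, t in role_atoms for x in (s, t)} | set(answer_vars)
    parent = {t: t for t in terms}           # union-find over the terms of q

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    # (a) an answer variable is not mapped to a named individual
    if any(val(x) not in named for x in answer_vars):
        return True

    # Close the relation under the fork rule until nothing changes.
    changed = True
    while changed:
        changed = False
        for (_, s, s1), (_, t, t1) in product(role_atoms, repeat=2):
            if find(s1) == find(t1) and is_aux(val(s1)) and find(s) != find(t):
                parent[find(s)] = find(t)
                changed = True

    # (b) two related terms are mapped to individuals that are not equal
    for s, t in product(terms, repeat=2):
        if s != t and find(s) == find(t) and not entails_eq(val(s), val(t)):
            return True

    # (c) G_aux contains a cycle (a self-loop counts as a cycle)
    vertices = {find(t) for t in terms if is_aux(val(t))}
    edges = {v: set() for v in vertices}
    for _, s, t in role_atoms:
        if find(s) in vertices and find(t) in vertices:
            edges[find(s)].add(find(t))
    state = {}

    def cyclic(v):
        state[v] = "open"
        for w in edges[v]:
            if state.get(w) == "open" or (w not in state and cyclic(w)):
                return True
        state[v] = "done"
        return False

    return any(v not in state and cyclic(v) for v in vertices)
```

Every loop ranges over pairs of atoms or terms of the query, so the check is polynomial in the size of the query, given a polynomial-time equality oracle.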

We next discuss the intuition behind our definitions. We
ground our discussion in minimal Herbrand models $I$ and
$J$, but our technique does not depend on such models: all
conditions are stated as entailments that can be checked
using an arbitrary sound and complete technique. Since $\mathcal{K}$
is an $\mathcal{ELHO}^r_\bot$ knowledge base, model $I$ is forest-shaped:
roughly speaking, the role assertions in $I$ that involve at
least one functional term are of the form $R(w_1, f_{P,A}(w_1))$
or $R(w_1, a)$ for $a \in N_I$; thus, $I$ can be viewed as a family
of directed trees whose roots are the individuals in $N_I$ and
whose edges point from parents to children or to the individuals
in $N_I$. This is illustrated in Figure 1, whose lower
part shows the forest-shaped model of the knowledge base from
Example 3. Note that assertions of the form $R(w_1, a)$ are
introduced via equality reasoning.

Now let $\tau$ be a substitution such that $D(\mathcal{K}) \models \tau(q)$, and
let $\pi = \tau|_{\vec{x}}$. If $\tau$ is not a spurious answer, it should be possible
to convert $\tau$ into a substitution $\pi^*$ such that $\pi = \pi^*|_{\vec{x}}$
and $\pi^*(q) \subseteq I$. Using the queries from Example 3, we next
identify three reasons why this may not be possible.

First, $\tau$ may map an answer variable of $q$ to an auxiliary
individual, so by the definition $\pi$ cannot be a certain answer
to $q$; condition (a) of Definition 9 identifies such cases.
Query $q_1$ and substitution $\tau_1$ from Example 3 illustrate such
a situation: $\tau_1(x_2) = o_{T,P}$ and $o_{T,P}$ is a 'true' auxiliary individual,
so $\pi_1$ is not a certain answer to $q_1$.

The remaining two problems arise because model $J$ is not
forest-shaped, so $\tau$ might map $q$ into $J$ in a way that cannot
be converted into a substitution $\pi^*$ that maps $q$ into $I$.

The second problem is best explained using substitution
$\tau_2$ and query $q_2$ from Example 3. Query $q_2$ contains a 'fork'
$\mathit{advisor}(y_1, y_3) \wedge \mathit{advisor}(y_2, y_3)$. Now $\tau_2(y_3) = o_{A,P}$ is a
'true' auxiliary individual, and so it represents 'true' auxiliary
terms $f_{A,P}(f_{T,P}(\mathit{ai}))$, $f_{A,P}(f_{T,P}(\mathit{kr}))$, and so on.
Since $I$ is forest-shaped, a match $\pi^*_2$ for $q_2$ in $I$ obtained
from $\tau_2$ would need to map $y_3$ to one of these terms; let
us assume that $\pi^*_2(y_3) = f_{A,P}(f_{T,P}(\mathit{ai}))$. Since $I$ is forest-shaped
and $f_{A,P}(f_{T,P}(\mathit{ai}))$ is a 'true' auxiliary term, this
means that both $y_1$ and $y_2$ must be mapped to the same term
(in both $J$ and $I$). This is captured by the (fork) rule: in our
example, the rule derives $y_1 \sim y_2$, and condition (b) of Definition 9
checks whether $\tau_2$ maps $y_1$ and $y_2$ in a way that
satisfies this constraint. Note that, due to role hierarchies,
the rule must also be applied to atoms $R(s, s')$ and $P(t, t')$
with $R \neq P$. Moreover, such constraints must be propagated
further up the query. In our example, due to $y_1 \sim y_2$,
atoms $\mathit{taught}(x_1, y_1) \wedge \mathit{taught}(x_2, y_2)$ in $q_2$ also constitute
a 'fork', so the rule derives $x_1 \sim x_2$; now this allows condition
(b) of Definition 9 to correctly identify $\tau_2$ as spurious.

The third problem is best explained using substitution $\tau_3$
and query $q_3$ from Example 3. Model $J$ contains a 'loop' on
individual $o_{A,P}$, which allows $\tau_3$ to map $q_3$ into $J$. In contrast,
model $I$ is forest-shaped, and so the 'true' auxiliary
terms that correspond to $o_{A,P}$ do not form loops. Condition
(c) of Definition 9 detects such situations using the graph
$G_{\mathsf{aux}}$. The vertices of $G_{\mathsf{aux}}$ correspond to the terms of $q$ that
are matched to 'true' auxiliary individuals (mapping $\gamma$ simply
ensures that equal terms are represented as one vertex),
and edges of $G_{\mathsf{aux}}$ correspond to the role atoms in $q$. Hence,
if $G_{\mathsf{aux}}$ is cyclic, then the substitution $\pi^*$ obtained from $\tau$
would need to match the query $q$ over a cycle of 'true' auxiliary
terms, which is impossible since $I$ is forest-shaped.

[Table 2: Size of the materialisations. Surviving entries: 426144; 426164 (0.01); (0.01); 17945; 17945; 47248; 17953 (0.04); 25608 (0.03); 76590 (38.3).]

Unlike the known combined approaches, our approach
does not extend $q$ with conditions that detect spurious answers.
Due to nominals, the relevant equality constraints
have a recursive nature, and they depend on both the substitution
$\tau$ and on the previously derived constraints. Consequently,
filtering in our approach is realised as postprocessing;
furthermore, to ensure correctness of our filtering
condition, auxiliary individuals must depend on both a role
and an atomic concept. The following theorem establishes the
correctness of our approach.

Theorem 10. Let $\mathcal{K} = (\mathcal{T}, \mathcal{A})$ be a satisfiable $\mathcal{ELHO}^r_\bot$ KB,
let $q = \exists \vec{y}.\psi(\vec{x}, \vec{y})$ be a CQ, and let $\pi : \vec{x} \to N_I$ be a candidate
answer for $q$. Then, $\Xi(\mathcal{K}) \models \pi(q)$ iff a substitution
$\tau$ exists such that $\mathsf{dom}(\tau) = \vec{x} \cup \vec{y}$, $\tau|_{\vec{x}} = \pi$, $D(\mathcal{K}) \models \tau(q)$,
and $\mathsf{isSpur}(q, D(\mathcal{K}), \tau) = \mathsf{f}$.

Furthermore, $\mathsf{isSpur}(q, D(\mathcal{K}), \tau)$ can be evaluated in polynomial
time, so the main source of complexity in our approach
is in deciding whether $D(\mathcal{K}) \models \tau(q)$ holds. This
gives rise to the following result.

Theorem 11. Deciding whether $\mathcal{K} \models \pi(q)$ holds can be implemented
in nondeterministic polynomial time w.r.t. the size
of $\mathcal{K}$ and $q$, and in polynomial time w.r.t. the size of $\mathcal{A}$.

Evaluation 

To gain insight into the practical applicability of our approach,
we implemented our technique in a prototypical system.
The system uses HermiT, a widely used ontology reasoner,
as a datalog engine in order to materialise the consequences
of $D(\mathcal{K})$ and evaluate $q$. The system has been implemented
in Java, and we ran our experiments on a MacBook
Pro with 4 GB of RAM and an Intel Core 2 Duo
2.4 GHz processor. We used two ontologies in our evaluation,
details of which are given below. The ontologies,
queries, and the prototype system are all available online at
http://www.cs.ox.ac.uk/isg/tools/KARMA/.

The LSTW benchmark (Lutz et al. 2012) consists of an
OWL 2 QL version of the LUBM ontology (Guo, Pan, and
Heflin 2005), queries $q^l_1, \ldots, q^l_{11}$, and a data generator. The
LSTW ontology extends the standard LUBM ontology with
several axioms of type 6 (see Table 1). To obtain an $\mathcal{ELHO}^r_\bot$
ontology, we removed inverse roles and datatypes, added 11
axioms using 9 freshly introduced nominals, and added one
axiom of type 4 (see Table 1). These additional axioms resemble
the ones in Example 1, and they were designed to
test equality reasoning. The resulting signature consists of
132 concepts, 32 roles, and 9 nominals, and the ontology
contains 180 axioms. From the 11 LSTW queries, we did
not consider queries $q^l_4$, $q^l_6$, q\, and $q^l_{11}$ because their result
sets were empty: q\ relies on existential quantification over
inverse roles, and the other three are empty already w.r.t.
the original LSTW ontology. Query $q^l_2$ is similar to query $q_2$
from Example 3, and it was designed to produce only spurious
answers and thus stress the system. We generated data
sets with 5, 10, and 20 universities. For each data set, we denote
with L-$i$ the knowledge base consisting of our $\mathcal{ELHO}^r_\bot$
ontology and the ABox for $i$ universities (see Table 2).

Table 3: Total number of answers and ratio of spurious answers to all answers. For LSTW, the ratio is stable for each data set.

SEMINTEC is an ontology about financial services developed
within the SEMINTEC project at the University of
Poznan. To obtain an $\mathcal{ELHO}^r_\bot$ ontology, we removed inverse
roles, role functionality axioms, and universal restrictions,
added nine axioms of type 6 (see Table 1), and added
six axioms using 4 freshly introduced nominals. The resulting
ontology signature consists of 60 concepts, 16 roles,
and 4 nominals, and the ontology contains 173 axioms.
Queries <?f — g| are tree-shaped queries used in the SEM- 
INTEC project, and we developed queries ourselves. 
Query q| resembles query q l 2 from LSTW, and queries 
and q$ were designed to retrieve a large number of answers 
containing auxiliary individuals, thus stressing condition (a) 
of Definition 9. Finally, the SEMINTEC ontology comes 
with a data set consisting of approximately 65,000 facts con- 
cerning 18,000 individuals (see row SEM in Table 2). 

The practicality of our approach is, we believe, determined mainly by the following two factors. First, the number of facts involving auxiliary individuals introduced during the materialisation phase should not be 'too large'. Table 2 shows the materialisation results: the first column shows the number of individuals before and after materialisation and the percentage of 'true' auxiliary individuals, the second column shows the number of unary facts before and after materialisation and the percentage of facts involving a 'true' auxiliary individual, and the third column does the same for binary facts. As one can see, for each input data set, the materialisation step introduces few 'true' auxiliary individuals, and the number of facts at most doubles. The number of unary facts involving a 'true' auxiliary individual does not change with the size of the input data set, whereas the number of such binary facts increases by a constant factor. This is because, in clauses of type 6, atoms $A(o_{R,A})$ do not contain a variable, whereas atoms $R(x, o_{R,A})$ do.
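The growth pattern described above can be illustrated with a small sketch. It assumes a single hypothetical pair of type-6 clauses, $A_1(x) \to R(x, o_{R,A})$ and $A_1(x) \to A(o_{R,A})$; the function name and the tuple encoding of facts are assumptions made for illustration and are not taken from the evaluated system.

```python
def materialise_type6(a1_members, role="R", concept="A"):
    """Apply A1(x) -> R(x, o_{R,A}) and A1(x) -> A(o_{R,A}) to every member of A1."""
    aux = f"o_{role},{concept}"        # one auxiliary individual per (role, concept) pair
    unary, binary = set(), set()
    for x in a1_members:
        binary.add((role, x, aux))     # head mentions x: one new fact per individual
        unary.add((concept, aux))      # variable-free head: the same fact every time
    return unary, binary


for n in (10, 100, 1000):
    unary, binary = materialise_type6([f"a{i}" for i in range(n)])
    print(n, len(unary), len(binary))
```

In this toy setting the number of unary facts over the auxiliary individual stays fixed while the number of binary facts grows with the number of individuals in $A_1$, which matches the behaviour reported for Table 2.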

Second, evaluating $q$ over $D(\mathcal{K})$ should not produce too many spurious answers. Table 3 shows the total number of answers for each query, that is, the number of answers obtained by evaluating the query over $D(\mathcal{K})$; furthermore, the
table also shows what percentage of these answers are spuri- 
ous. Queries q l 2 , q[ , g|, and gf retrieve a significant percent- 
age of spurious answers. However, only query q l 2 has proven 
to be challenging for our system due to the large number of 
retrieved answers, with an evaluation time of about 40 min- 
utes over the largest knowledge base (L-20). Surprisingly, q[ 
also performed rather poorly despite a low number of spu- 
rious answers, with an evaluation time of about 20 minutes 
for L-20. All other queries were evaluated in at most a few 
seconds, thus suggesting that queries q[ and q 2 are problem- 
atical mainly because HermiT does not implement query op- 
timisation algorithms typically used in relational databases. 

Conclusion 

We presented the first combined technique for answering 
conjunctive queries over DL ontologies that include nomi- 
nals. A preliminary evaluation suggests the following. First, 
the number of materialised facts over 'true' anonymous in- 
dividuals increases by a constant factor with the size of 
the data. Second, query evaluation results have shown that, 
while some cases may be challenging, in most cases the per- 
centage of answers that are spurious is manageable. Hence, 
our technique provides a practical CQ answering algorithm 
for a large fragment of OWL 2 EL. 

We anticipate several directions for our future work. First, 
we would like to investigate the use of top-down query eval- 
uation techniques, such as magic sets (Abiteboul, Hull, and 
Vianu 1995) or SLG resolution (Chen and Warren 1993). 
Second, tighter integration of the detection of spurious an- 
swers with the query evaluation algorithms should make it 
possible to eagerly detect spurious answers (i.e., before the 
query is fully evaluated). Lutz et al. (2012) already imple- 
mented a filtering condition as a user-defined function in a 
database, but it is unclear to what extent such an implemen- 
tation can be used to optimise query evaluation. Finally, we 
would like to extend our approach to all of OWL 2 EL. 
Additional Proofs



Proof of Lemma 3 

Lemma 3. Let $I$ and $J$ be the minimal Herbrand models of $\Xi(\mathcal{K})$ and $D(\mathcal{K})$, respectively. Mapping $\delta$ satisfies the following three properties for all terms $w'$ and $w$, each $B \in N_C \cup \{\top, \bot\}$, and each $R \in N_R$.

1. $B(w) \in I$ implies $B(\delta(w)) \in J$.

2. $R(w', w) \in I$ implies $R(\delta(w'), \delta(w)) \in J$.

3. $w' \approx w \in I$ implies $\delta(w') \approx \delta(w) \in J$.

Proof. Let $I_0, I_1, \ldots$ be the sequence of sets used to construct $I$. We show by induction on $n$ that each $I_n$ satisfies the properties.

Base case. Consider $I_0$ and an arbitrary fact $H \in I_0$. Each term occurring in $H$ is contained in $N_I$. Moreover, $H$ is a fact from $\Xi(\mathcal{K})$ and, by definition, it is also a fact from $D(\mathcal{K})$. Now $\delta$ is the identity over $N_I$, and $J$ satisfies $H$, so properties 1 and 2 hold. Property 3 holds vacuously since $I_0$ does not contain facts with the equality predicate.

Inductive step. Consider an arbitrary $n \in \mathbb{N}$ and assume that $I_n$ satisfies properties 1-3; we show that the same holds for $I_{n+1}$. Towards this goal, we consider the different clauses in $\Xi(\mathcal{K}) \cup \Xi(\mathcal{K})_\approx$ that can derive fresh facts from $I_n$. We distinguish the following two cases.

First, consider an arbitrary datalog clause of the form $\varphi \to \psi$ from $\Xi(\mathcal{K}) \cup \Xi(\mathcal{K})_\approx$. Let $\sigma$ be an arbitrary substitution mapping variables occurring in the clause to the terms in the Herbrand universe of $\Xi(\mathcal{K})$ such that $\sigma(\varphi) \subseteq I_n$, so the clause derives $\sigma(\psi) \in I_{n+1}$. Let $\sigma'$ be the substitution defined such that $\sigma'(x) = \delta(\sigma(x))$ for each variable $x$ occurring in the clause. By the inductive hypothesis, we have $\sigma'(\varphi) \subseteq J$. Furthermore, by the definition of $D(\mathcal{K})$, we have that $D(\mathcal{K}) \cup D(\mathcal{K})_\approx$ contains $\varphi \to \psi$. Finally, since $J$ satisfies $\varphi \to \psi$, we have $\sigma'(\psi) \in J$, as required.

Second, consider arbitrary clauses from $\Xi(\mathcal{K})$ of the form $A_1(x) \to R(x, f_{R,A}(x))$ and $A_1(x) \to A(f_{R,A}(x))$, and assume that $A_1(w) \in I_n$; hence, these clauses derive $\{R(w, f_{R,A}(w)), A(f_{R,A}(w))\} \subseteq I_{n+1}$. By the inductive hypothesis, we have $A_1(\delta(w)) \in J$. Furthermore, by the definition of $\delta$, we have that $\delta(f_{R,A}(w)) = o_{R,A}$. Moreover, by the definition of program $D(\mathcal{K})$, the program contains clauses $A_1(x) \to R(x, o_{R,A})$ and $A_1(x) \to A(o_{R,A})$. Finally, model $J$ satisfies both of these clauses, so we have $\{R(\delta(w), o_{R,A}), A(o_{R,A})\} \subseteq J$, as required. □
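The homomorphism property established above can be checked mechanically on small examples. The following sketch assumes, as in the proof, that $\delta$ is the identity on named individuals and maps each functional term $f_{R,A}(w)$ to the auxiliary individual $o_{R,A}$. The tuple encoding of terms and facts and the two toy models are hypothetical and serve only as an illustration.

```python
def delta(term):
    """Map a term of the Skolemised program to a term of the datalog program."""
    if isinstance(term, tuple):        # functional term ("f", R, A, argument)
        _, role, concept, _arg = term
        return ("o", role, concept)    # auxiliary individual o_{R,A}
    return term                        # identity on named individuals


def image(facts):
    """Apply delta to every argument of every fact (predicate, arg1, ...)."""
    return {(f[0],) + tuple(delta(a) for a in f[1:]) for f in facts}


# Hypothetical fragment of I containing nested functional terms.
fa = ("f", "R", "A", "a")
ffa = ("f", "R", "A", fa)
I = {("A1", "a"), ("R", "a", fa), ("A", fa), ("R", fa, ffa), ("A", ffa)}

# Hypothetical fragment of J over the single auxiliary individual o_{R,A}.
o = ("o", "R", "A")
J = {("A1", "a"), ("R", "a", o), ("A", o), ("R", o, o)}

print(image(I) <= J)   # every fact of I is mapped to a fact of J
```

Note how the unboundedly deep terms of $I$ collapse onto one individual of $J$; this collapse is what later makes a filtering step necessary.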

Proof of Lemma 4 

In order to prove Lemma 4, we use the properties from Lemmas 12 and 13. 

Lemma 12. For each term $w_2$, each role $R \in N_R$, and each concept $A \in N_C \cup \{\top\}$, if $f_{R,A}(w_2) \in \mathrm{dom}(I)$, then $\{R(w_2, f_{R,A}(w_2)), A(f_{R,A}(w_2))\} \subseteq I$.

Proof. Let $I_0, I_1, \ldots$ be the sequence used to construct $I$; we assume w.l.o.g. that each $I_{n+1}$ is obtained from $I_n$ by applying just one clause type. We show by induction on $n$ that each $I_n$ satisfies the property. For the base case, set $I_0$ clearly satisfies the property since it does not contain functional terms. For the inductive step, assume that some $I_n$ satisfies the property, and consider an arbitrary term $w_2$, role $R$, and concept $A \in N_C \cup \{\top\}$. By the construction of $\Xi(\mathcal{K})$, there are only two types of clauses that may introduce new functional terms in $\mathrm{dom}(I_{n+1})$. First, such a term may be introduced by clauses of type 6 (see Table 1), but then the term clearly satisfies the required property. Second, a clause of the form $x \approx y \to f_{R,A}(x) \approx f_{R,A}(y)$ may be applied to $w_1 \approx w_2 \in I_n$ and derive $f_{R,A}(w_1) \approx f_{R,A}(w_2) \in I_{n+1}$. If $f_{R,A}(w_1) \in \mathrm{dom}(I_n)$, then set $I_{n+1}$ satisfies the required property by the induction hypothesis. Otherwise, term $f_{R,A}(w_2)$ occurs in $I_{n+1}$ only in equality assertions, so $f_{R,A}(w_2) \notin \mathrm{dom}(I_{n+1})$, and the property holds vacuously. □

Let $J_0, J_1, \ldots$ be the sequence used to construct the minimal Herbrand model $J$ of $D(\mathcal{K})$. We assume w.l.o.g. that each $J_{n+1}$ is obtained from $J_n$ by applying a single clause occurring in $D(\mathcal{K})$, apart from the clause defining the symmetry of $\approx$, which is always applied so as to keep the relation $\approx$ in $J_n$ symmetric. We next show that each $J_n$ satisfies the following property.

Lemma 13. For each $n \in \mathbb{N}$ and all terms $u_1$ and $u_2$, if $u_1 \approx u_2 \in J_n$ and $u_2 \in \mathrm{aux}_{J_n}$, then $u_1 = u_2$.

Proof. We prove the claim by induction on $n$. For the base case, $J_0$ satisfies the property since $\mathrm{aux}_{J_0}$ is empty. For the inductive step, assume that some $J_n$ satisfies the property; we show that the same holds for $J_{n+1}$. We consider the various clauses that may derive an equality in $J_{n+1}$. The facts derived by a clause of the form $A(x) \to x \approx a$ vacuously satisfy the property since the derived fact involves terms that are not in $\mathrm{aux}_{J_{n+1}}$. Furthermore, a fact derived in $J_{n+1}$ by applying either the reflexivity or the symmetry clause satisfies the property by the inductive hypothesis. We are left to consider the transitivity clause. Let $u_1$, $u_2$, and $u_3$ be arbitrary terms such that $\{u_1 \approx u_2, u_2 \approx u_3\} \subseteq J_n$, so the transitivity clause derives $u_1 \approx u_3 \in J_{n+1}$. We consider the interesting case in which $u_3 \in \mathrm{aux}_{J_{n+1}}$, so $u_3 \in \mathrm{aux}_{J_n}$. By the inductive hypothesis, we have $u_2 = u_3$; but then, $u_2 \in \mathrm{aux}_{J_n}$, and so again, by the inductive hypothesis, we have $u_1 = u_2$; finally, this implies that $u_1 = u_3$. □
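Lemma 13 can be read as an invariant of the materialised equality relation, and a small sketch makes it concrete. The code assumes that $\mathrm{aux}_J$ consists of those auxiliary individuals that are not equal to any named individual in $J$, which is how the set is used in the later proofs; the function names and the pair encoding of equalities are hypothetical.

```python
def true_aux(aux_individuals, named, equalities):
    """Auxiliary individuals that are not equal to any named individual."""
    merged = {u for (u, v) in equalities if v in named} | \
             {v for (u, v) in equalities if u in named}
    return {o for o in aux_individuals if o not in merged}


def lemma13_holds(aux_individuals, named, equalities):
    """Check that every equality u ~ v with v in aux_J is trivial (u == v)."""
    aux = true_aux(aux_individuals, named, equalities)
    return all(u == v for (u, v) in equalities if v in aux)


named = {"a", "b"}
aux_individuals = {"o1", "o2"}
# o1 has been merged with the nominal a; o2 is equal only to itself.
equalities = {("a", "a"), ("b", "b"), ("o1", "o1"), ("o2", "o2"),
              ("o1", "a"), ("a", "o1")}

print(true_aux(aux_individuals, named, equalities))
print(lemma13_holds(aux_individuals, named, equalities))
```

An equality relation that related two distinct 'true' auxiliary individuals would make the check fail, which by the lemma cannot arise from the clauses of $D(\mathcal{K})$.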

Lemma 4. Let $I$ and $J$ be the minimal Herbrand models of $\Xi(\mathcal{K})$ and $D(\mathcal{K})$, respectively. Mapping $\delta$ satisfies the following five properties for all terms $w_1$ and $w_2$ in $\mathrm{dom}(I)$, each $B \in N_C \cup \{\top, \bot\}$, and each $R \in N_R$.

1. $B(\delta(w_1)) \in J$ implies that $B(w_1) \in I$.

2. $R(\delta(w_1), \delta(w_2)) \in J$ and $\delta(w_2) \notin \mathrm{aux}_J$ imply that $R(w_1, w_2) \in I$.

3. $R(\delta(w_1), \delta(w_2)) \in J$ and $\delta(w_2) \in \mathrm{aux}_J$ imply that $\delta(w_2)$ is of the form $o_{P,A}$, that $R(w_1, f_{P,A}(w_1)) \in I$, and that a term $w_1'$ exists such that $R(w_1', w_2) \in I$.

4. $\delta(w_1) \approx \delta(w_2) \in J$ and $\delta(w_2) \notin \mathrm{aux}_J$ imply that $w_1 \approx w_2 \in I$.

5. For each term $u$ occurring in $J$, a term $w \in \mathrm{dom}(I)$ exists such that $\delta(w) = u$.

Proof. Let $J_0, J_1, \ldots$ be the sequence as stated above. We prove the claim by induction on $n$.

Base case. Consider $J_0$. By definition, $\Xi(\mathcal{K}) \cup \Xi(\mathcal{K})_\approx$ and $D(\mathcal{K}) \cup D(\mathcal{K})_\approx$ contain the same facts, all of which only refer to the individuals in $N_I$ and the predicates in $N_C \cup N_R \cup \{\top, \bot\}$. Since $\delta$ is the identity over $N_I$, $\mathrm{aux}_{J_0}$ is empty and $J_0 = I_0$, so properties 1-5 are satisfied.

Inductive step. Assume that some $J_n$ satisfies properties 1-5; we show that the same holds for $J_{n+1}$. To this end, let $w_1$ and $w_2$ be arbitrary terms in $\mathrm{dom}(I)$. We next consider the various clauses in $D(\mathcal{K}) \cup D(\mathcal{K})_\approx$ that may derive fresh assertions in $J_{n+1}$.

• $A(x) \to B(x)$. Assume that $A(\delta(w_1)) \in J_n$, and so the clause derives $B(\delta(w_1)) \in J_{n+1}$. By the inductive hypothesis, we have $A(w_1) \in I$. Finally, since the same clause occurs in $\Xi(\mathcal{K})$, we have $B(w_1) \in I$.

• $A(x) \to x \approx a$. Assume that $A(\delta(w_1)) \in J_n$, and so for $\delta(w_2) = w_2 = a$ the clause derives $\delta(w_1) \approx \delta(w_2)$ in $J_{n+1}$. Clearly, we have $\delta(w_2) \notin \mathrm{aux}_{J_{n+1}}$. By the inductive hypothesis, we have $A(w_1) \in I$. Finally, since the same clause occurs in $\Xi(\mathcal{K})$, we have $w_1 \approx w_2 \in I$.

• $A_1(x) \wedge A_2(x) \to A(x)$. Assume that $A_1(\delta(w_1)) \in J_n$ and $A_2(\delta(w_1)) \in J_n$, and so the clause derives $A(\delta(w_1)) \in J_{n+1}$. By the inductive hypothesis, we have $\{A_1(w_1), A_2(w_1)\} \subseteq I$. Since the same clause occurs in $\Xi(\mathcal{K})$, we have $A(w_1) \in I$.

• $R(x, y) \wedge A_1(y) \to A(x)$. Assume that $R(\delta(w_1), \delta(w_2))$ and $A_1(\delta(w_2))$ are contained in $J_n$, and so the clause derives $A(\delta(w_1)) \in J_{n+1}$. We have the following two cases.

- $\delta(w_2) \notin \mathrm{aux}_{J_n}$. By the inductive hypothesis, we then have $\{R(w_1, w_2), A_1(w_2)\} \subseteq I$.

- $\delta(w_2) \in \mathrm{aux}_{J_n}$ and term $\delta(w_2)$ is an auxiliary individual of the form $o_{P,A}$. By the inductive hypothesis, we then have $\{R(w_1, f_{P,A}(w_1)), A_1(f_{P,A}(w_1))\} \subseteq I$.

In either case, since the same clause occurs in $\Xi(\mathcal{K})$, we have $A(w_1) \in I$.

• $R(x, y) \to A(y)$. Assume that $R(\delta(w_1), \delta(w_2)) \in J_n$, so the clause derives $A(\delta(w_2)) \in J_{n+1}$. We have the following two cases.

- $\delta(w_2) \notin \mathrm{aux}_{J_n}$. By the inductive hypothesis, we then have $R(w_1, w_2) \in I$.

- $\delta(w_2) \in \mathrm{aux}_{J_n}$. By the inductive hypothesis, there exists a term $w_1'$ such that $R(w_1', w_2) \in I$.

In either case, since the same clause occurs in $\Xi(\mathcal{K})$, we have $A(w_2) \in I$.

• $S(x, y) \to R(x, y)$. Assume that $S(\delta(w_1), \delta(w_2)) \in J_n$, and so the clause derives $R(\delta(w_1), \delta(w_2)) \in J_{n+1}$. We have the following two cases.

- $\delta(w_2) \notin \mathrm{aux}_{J_n}$. By the inductive hypothesis, we have that $S(w_1, w_2) \in I$. Since the same clause occurs in $\Xi(\mathcal{K})$, we have $R(w_1, w_2) \in I$.

- $\delta(w_2) \in \mathrm{aux}_{J_n}$ and $\delta(w_2)$ is an auxiliary individual of the form $o_{P,A}$. By the inductive hypothesis, there exists a term $w_1'$ such that $\{S(w_1, f_{P,A}(w_1)), S(w_1', w_2)\} \subseteq I$. Since the same clause occurs in $\Xi(\mathcal{K})$, we have that $\{R(w_1, f_{P,A}(w_1)), R(w_1', w_2)\} \subseteq I$.

• $A_1(x) \to R(x, o_{R,A})$. Assume that $A_1(\delta(w_1)) \in J_n$, so for $\delta(w_2) = o_{R,A}$ the clause derives $R(\delta(w_1), \delta(w_2))$ in $J_{n+1}$. By the inductive hypothesis, we then have $A_1(w_1) \in I$. Furthermore, by the definition of $D(\mathcal{K})$, set $\Xi(\mathcal{K})$ contains the clause $A_1(x) \to R(x, f_{R,A}(x))$, so we have $R(w_1, f_{R,A}(w_1)) \in I$. We have the following cases.

- $\delta(w_2) \notin \mathrm{aux}_{J_{n+1}}$. Thus, we also have $\delta(w_2) \notin \mathrm{aux}_{J_n}$, and so there exists some $c \in N_I$ such that $\delta(w_2) \approx \delta(c) \in J_n$ and $\delta(c) \notin \mathrm{aux}_{J_n}$. By the inductive hypothesis, we have $w_2 \approx c \in I$. Due to $\delta(w_2) = \delta(f_{R,A}(w_1))$ and the inductive hypothesis, we have $c \approx f_{R,A}(w_1) \in I$. Since $\approx$ is a congruence relation and $\{R(w_1, f_{R,A}(w_1)), c \approx f_{R,A}(w_1), c \approx w_2\} \subseteq I$, we have $R(w_1, w_2) \in I$, as required. By the inductive hypothesis, property 5 is also satisfied.

- $\delta(w_2) \in \mathrm{aux}_{J_{n+1}}$. By the definition of $\delta$, term $w_2$ is of the form $f_{R,A}(w_2')$, and, by the induction hypothesis, we have that $f_{R,A}(w_2') \in \mathrm{dom}(I)$. By Lemma 12, we have that $R(w_2', f_{R,A}(w_2')) \in I$. As stated above, $R(w_1, f_{R,A}(w_1)) \in I$, so property 3 is satisfied. Moreover, $\delta(f_{R,A}(w_1)) = o_{R,A}$, and so property 5 is satisfied as well.

• $A_1(x) \to A(o_{R,A})$. Assume that $A_1(\delta(w_1)) \in J_n$, so for $\delta(w_2) = o_{R,A}$ the clause derives $A(\delta(w_2)) \in J_{n+1}$. By the definition of $\delta$, term $w_2$ is of the form $f_{R,A}(w_2')$. By Lemma 12 and $w_2 \in \mathrm{dom}(I)$, we have $A(w_2) \in I$.



• $\to x \approx x$. Assume that $\delta(w_1)$ occurs in $J_n$, so the clause derives $\delta(w_1) \approx \delta(w_2) \in J_{n+1}$ with $\delta(w_1) = \delta(w_2)$. We consider the interesting case when $\delta(w_2) \notin \mathrm{aux}_{J_{n+1}}$, and so $\delta(w_2) \notin \mathrm{aux}_{J_n}$. Then, an individual $c \in N_I$ exists such that $\{\delta(w_1) \approx c, c \approx \delta(w_2)\} \subseteq J_n$. By the inductive hypothesis, we have that $\{w_1 \approx c, c \approx w_2\} \subseteq I$. By the transitivity of $\approx$, we have $w_1 \approx w_2 \in I$.

• $x_1 \approx x_2 \to x_2 \approx x_1$. Assume that $\delta(w_1) \approx \delta(w_2) \in J_n$, so the clause derives $\delta(w_2) \approx \delta(w_1) \in J_{n+1}$. We consider the interesting case when $\delta(w_1) \notin \mathrm{aux}_{J_{n+1}}$; clearly, we have $\delta(w_1) \notin \mathrm{aux}_{J_n}$ as well. Since predicate $\approx$ is symmetric in $J_n$, we have $\delta(w_2) \approx \delta(w_1) \in J_n$. By the inductive hypothesis, we have $w_2 \approx w_1 \in I$.

• $x_1 \approx x_3 \wedge x_3 \approx x_2 \to x_1 \approx x_2$. Assume that set $J_n$ contains $\delta(w_1) \approx \delta(w_3)$ and $\delta(w_3) \approx \delta(w_2)$, so the clause derives $\delta(w_1) \approx \delta(w_2) \in J_{n+1}$. The only interesting case is when $\delta(w_2) \notin \mathrm{aux}_{J_{n+1}}$; clearly, then $\delta(w_2) \notin \mathrm{aux}_{J_n}$. By Lemma 13, then $\delta(w_3) \notin \mathrm{aux}_{J_n}$. Finally, by the inductive hypothesis, then $\{w_1 \approx w_3, w_3 \approx w_2\} \subseteq I$, which implies $w_1 \approx w_2 \in I$.

• $A(x) \wedge x \approx y \to A(y)$. Assume that facts $A(\delta(w_1))$ and $\delta(w_1) \approx \delta(w_2)$ are contained in $J_n$, so the clause derives $A(\delta(w_2)) \in J_{n+1}$. By the inductive hypothesis, we have $A(w_1) \in I$. We consider the following two cases.

- $\delta(w_2) \notin \mathrm{aux}_{J_n}$. By the inductive hypothesis, we have $w_1 \approx w_2 \in I$, and so $A(w_2) \in I$.

- $\delta(w_2) \in \mathrm{aux}_{J_n}$. By Lemma 13, then $\delta(w_1) = \delta(w_2)$, so $A(\delta(w_2)) \in J_n$. Finally, by the inductive hypothesis, we then have $A(w_2) \in I$.

• $R(x, y) \wedge x \approx z \to R(z, y)$. Assume that set $J_n$ contains $R(\delta(w_1), \delta(w_2))$ and $\delta(w_1) \approx \delta(w_3)$, so the clause derives $R(\delta(w_3), \delta(w_2)) \in J_{n+1}$. We consider the following two cases.

- $\delta(w_2) \notin \mathrm{aux}_{J_n}$. By the inductive hypothesis, we have $R(w_1, w_2) \in I$. We distinguish two additional cases. First, assume that $\delta(w_3) \in \mathrm{aux}_{J_n}$. By Lemma 13, we have $\delta(w_1) = \delta(w_3)$, and so $R(\delta(w_3), \delta(w_2)) \in J_n$. By the inductive hypothesis, then $R(w_3, w_2) \in I$. Second, assume that $\delta(w_3) \notin \mathrm{aux}_{J_n}$. By the inductive hypothesis, we have $w_1 \approx w_3 \in I$, and so we have $R(w_3, w_2) \in I$ as well.

- $\delta(w_2) \in \mathrm{aux}_{J_n}$ and $\delta(w_2) = o_{P,A}$. By the inductive hypothesis, some $w_1'$ exists such that $\{R(w_1, f_{P,A}(w_1)), R(w_1', w_2)\} \subseteq I$. We distinguish two additional cases. First, assume that $\delta(w_3) \in \mathrm{aux}_{J_n}$. By Lemma 13, we have $\delta(w_1) = \delta(w_3)$, which further implies $R(\delta(w_3), \delta(w_2)) \in J_n$. By the inductive hypothesis, then we have $\{R(w_3, f_{P,A}(w_3)), R(w_1', w_2)\} \subseteq I$. Second, assume that $\delta(w_3) \notin \mathrm{aux}_{J_n}$. By the inductive hypothesis, we have $w_1 \approx w_3 \in I$. By the functional reflexivity clauses, then $f_{P,A}(w_1) \approx f_{P,A}(w_3) \in I$, which again implies $\{R(w_3, f_{P,A}(w_3)), R(w_1', w_2)\} \subseteq I$.

• $R(x, y) \wedge y \approx z \to R(x, z)$. Assume that set $J_n$ contains $R(\delta(w_1), \delta(w_2))$ and $\delta(w_2) \approx \delta(w_3)$, so the clause derives $R(\delta(w_1), \delta(w_3)) \in J_{n+1}$. We consider the following two cases.

- $\delta(w_2) \notin \mathrm{aux}_{J_n}$. By Lemma 13, then $\delta(w_3) \notin \mathrm{aux}_{J_n}$, and so $\delta(w_3) \notin \mathrm{aux}_{J_{n+1}}$ as well. By the inductive hypothesis, then $R(w_1, w_2)$ and $w_2 \approx w_3$ are in $I$, so $R(w_1, w_3) \in I$ as well.

- $\delta(w_2) \in \mathrm{aux}_{J_n}$ and $\delta(w_2)$ is of the form $o_{P,A}$. By Lemma 13, then $\delta(w_3) = \delta(w_2)$, which implies $R(\delta(w_1), \delta(w_3)) \in J_n$. Finally, by the inductive hypothesis, there exists a term $w_1'$ such that $\{R(w_1, f_{P,A}(w_1)), R(w_1', w_3)\} \subseteq I$. □

Proof of Proposition 5 

Proposition 5. For $\mathcal{K}$ an arbitrary $\mathcal{ELHO}^r_\bot$ knowledge base, $\Xi(\mathcal{K}) \models \exists y.\bot(y)$ if and only if $D(\mathcal{K}) \models \exists y.\bot(y)$. Furthermore, the satisfiability of $\mathcal{K}$ can be checked in time polynomial in the size of $\mathcal{K}$.

Proof. From Lemmas 3 and 4, we have $\bot(w) \in I$ if and only if $\bot(\delta(w)) \in J$. Thus, $\mathcal{K}$ is unsatisfiable if and only if an individual $u$ exists such that $D(\mathcal{K}) \models \bot(u)$. Furthermore, to check the latter, we can compute $J$ and check whether an individual $u$ exists such that $\bot(u) \in J$. Since the number of variables occurring in each datalog clause is bounded by a constant, the computation of $J$ can be implemented in polynomial time in the size of $\mathcal{K}$ (Dantsin et al. 2001). □
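The satisfiability check described in the proof (compute $J$, then look for a fact $\bot(u)$) can be sketched with a naive bottom-up datalog evaluator. This is a minimal illustration under stated assumptions and not an optimised procedure: facts and atoms are tuples of strings, variables are strings starting with `?`, and the predicate name `bot` standing for $\bot$ is a hypothetical encoding.

```python
def match(atom, fact, subst):
    """Extend subst so that atom matches fact, or return None."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    subst = dict(subst)
    for term, value in zip(atom[1:], fact[1:]):
        if term.startswith("?"):                    # variable
            if subst.setdefault(term, value) != value:
                return None
        elif term != value:                         # constant mismatch
            return None
    return subst


def materialise(facts, rules):
    """Naive evaluation: apply all rules until no new fact is derived."""
    model = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            substs = [{}]
            for atom in body:                       # join the body atoms left to right
                substs = [s2 for s in substs for f in model
                          if (s2 := match(atom, f, s)) is not None]
            for s in substs:
                fact = (head[0],) + tuple(s.get(t, t) for t in head[1:])
                if fact not in model:
                    model.add(fact)
                    changed = True
    return model


def is_satisfiable(facts, rules):
    """The knowledge base is satisfiable iff no fact bot(u) is derived."""
    return not any(f[0] == "bot" for f in materialise(facts, rules))


rules = [((("A", "?x"),), ("B", "?x")),
         ((("B", "?x"), ("C", "?x")), ("bot", "?x"))]
print(is_satisfiable({("A", "a")}, rules))
print(is_satisfiable({("A", "a"), ("C", "a")}, rules))
```

Because every clause has a bounded number of variables, the number of candidate substitutions per clause is polynomial in the number of individuals, which is the observation behind the polynomial bound in the proposition.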

Proof of Theorem 10 

We first show that the minimal Herbrand model $I$ of $\Xi(\mathcal{K})$ resembles a forest structure. Let $I_0, I_1, \ldots$ be the sets used to generate $I$; for simplicity, in the rest of this section we assume w.l.o.g. that the clauses are applied in a way so that relation $\approx$ is symmetric in each $I_n$. Furthermore, for each term $w$, we define the size of $w$ as follows.

$$|w| = \begin{cases} 0 & \text{if } w \in N_I, \\ 1 + |w'| & \text{if } w = f_{T,A}(w'). \end{cases}$$

Finally, we define the depth of $w$ in $I$ as follows.

$$d(w, I) = \begin{cases} 0 & \text{if } w \notin \mathrm{aux}_I, \\ 1 + d(w', I) & \text{if } w \in \mathrm{aux}_I \text{ and } w = f_{T,A}(w'). \end{cases}$$

Lemma 14. Interpretation $I$ satisfies the following three properties for all terms $w_1$, $w_1'$, $w_2$, and $w_2'$, all roles $R$, $S$, and $T$, and each concept $A \in N_C \cup \{\top\}$.

P1. $R(w_1', f_{T,A}(w_1)) \in I$, $S(w_2', f_{T,A}(w_2)) \in I$, $f_{T,A}(w_1) \approx f_{T,A}(w_2) \in I$, and $f_{T,A}(w_2) \in \mathrm{aux}_I$ imply $w_1' \approx w_2' \in I$.

P2. $w_1 \approx w_2 \in I$ implies $d(w_1, I) = d(w_2, I)$.

P3. $R(w_1', f_{T,A}(w_1)) \in I$ and $f_{T,A}(w_1) \in \mathrm{aux}_I$ imply that $d(f_{T,A}(w_1), I) = 1 + d(w_1', I)$.
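The two measures on terms used in Lemma 14 are easy to compute on an explicit term representation. The sketch below assumes that the size of a term counts nested function symbols (0 for named individuals), which is consistent with how size is used in the proof of P2, and that depth is 0 outside $\mathrm{aux}_I$ and otherwise one more than the depth of the argument. The tuple encoding `("f", R, A, argument)` is hypothetical.

```python
def size(term):
    """|w|: number of nested function symbols (0 for named individuals)."""
    return 1 + size(term[3]) if isinstance(term, tuple) else 0


def depth(term, aux):
    """d(w, I): 0 if w is not in aux, otherwise 1 + depth of the argument of w."""
    if term not in aux:
        return 0
    return 1 + depth(term[3], aux)


fa = ("f", "R", "A", "a")
ffa = ("f", "R", "A", fa)

print(size(ffa))                  # two nested function symbols
print(depth(ffa, {fa, ffa}))      # both functional terms are 'true' auxiliary terms
print(depth(ffa, {ffa}))          # f(a) was merged with a named individual
```

The last call shows why depth and size differ: a functional term that is equal to a named individual acts as a root of the forest, so counting restarts there.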

Proof. To prove properties P1-P3, we first show by induction on $n$ that each $I_n$ satisfies the following two auxiliary properties for all terms $w'$, $w$, $w_1$, and $w_2$, all roles $R$, $T$, and $T'$, and all concepts $A$ and $A'$ in $N_C \cup \{\top\}$.

A1. $f_{T',A'}(w_2) \in \mathrm{aux}_{I_n}$ and $f_{T,A}(w_1) \approx f_{T',A'}(w_2) \in I_n$ imply that $T = T'$, $A = A'$, and $w_1 \approx w_2 \in I$.

A2. $f_{T,A}(w) \in \mathrm{aux}_{I_n}$ and $R(w', f_{T,A}(w)) \in I_n$ imply that a term $w''$ exists such that $I$ contains $T(w'', f_{T,A}(w''))$, $w' \approx w''$, and $f_{T,A}(w) \approx f_{T,A}(w'')$.

Base case. By definition, $I_0$ does not contain functional terms, so properties A1 and A2 are vacuously true.

Inductive step. Assume that $I_n$ satisfies properties A1 and A2; we show that the same holds for $I_{n+1}$ by considering in turn the various clauses that may introduce fresh assertions into $I_{n+1}$. We consider only the interesting cases in which an equality or a binary assertion is derived, since all other clauses trivially preserve A1 and A2. Let $w'$, $w$, $w_1$, $w_1'$, and $w_2$ be arbitrary terms, let $R$, $T$, and $T'$ be arbitrary roles, and let $A$ and $A'$ be arbitrary concepts in $N_C \cup \{\top\}$.

• $A_1(x) \to x \approx a$. Assume that $A_1(w_1) \in I_n$, so the clause derives $w_1 \approx a \in I_{n+1}$. Since $a \notin \mathrm{aux}_{I_{n+1}}$, properties A1 and A2 are preserved.

• $\to x \approx x$. Assume that $f_{T,A}(w_1)$ occurs in $I_n$, so the clause derives $f_{T,A}(w_1) \approx f_{T,A}(w_1) \in I_{n+1}$; the interesting case is when $f_{T,A}(w_1) \in \mathrm{aux}_{I_{n+1}}$. Since $f_{T,A}(w_1)$ occurs in $I_n$, then $w_1$ occurs in the Herbrand universe of $\Xi(\mathcal{K})$. By reflexivity, then $w_1 \approx w_1 \in I$, as required for A1. Furthermore, this derivation clearly preserves A2.

• $x \approx y \to f_{T,A}(x) \approx f_{T,A}(y)$. Assume $w_1 \approx w_2 \in I_n$, so the clause derives $f_{T,A}(w_1) \approx f_{T,A}(w_2) \in I_{n+1}$; the interesting case is when $f_{T,A}(w_2) \in \mathrm{aux}_{I_{n+1}}$. By assumption, $w_1 \approx w_2 \in I_n$, and so $w_1 \approx w_2 \in I$, as required for property A1. Furthermore, this derivation clearly preserves A2.

• $x_1 \approx x_2 \to x_2 \approx x_1$. Assume that $f_{T',A'}(w_2) \approx f_{T,A}(w_1) \in I_n$, so the clause derives $f_{T,A}(w_1) \approx f_{T',A'}(w_2) \in I_{n+1}$; the interesting case is when $f_{T',A'}(w_2) \in \mathrm{aux}_{I_{n+1}}$, which clearly implies $f_{T',A'}(w_2) \in \mathrm{aux}_{I_n}$. Since relation $\approx$ is symmetric in $I_n$, we have $f_{T,A}(w_1) \approx f_{T',A'}(w_2) \in I_n$; but then, by the inductive hypothesis, we have $T = T'$, $A = A'$, and $w_1 \approx w_2 \in I$, as required for property A1. Furthermore, this derivation clearly preserves A2.

• $x_1 \approx x_2 \wedge x_2 \approx x_3 \to x_1 \approx x_3$. Assume that $I_n$ contains $f_{T,A}(w_1) \approx f_{T',A'}(w_1')$ and $f_{T',A'}(w_1') \approx f_{T'',A''}(w_2)$, so the clause derives $f_{T,A}(w_1) \approx f_{T'',A''}(w_2) \in I_{n+1}$; the interesting case is when $f_{T'',A''}(w_2) \in \mathrm{aux}_{I_{n+1}}$, which clearly implies $f_{T'',A''}(w_2) \in \mathrm{aux}_{I_n}$. Clearly, we then also have $f_{T',A'}(w_1') \in \mathrm{aux}_{I_n}$. By the inductive hypothesis, we have $T = T' = T''$, $A = A' = A''$, and $\{w_1 \approx w_1', w_1' \approx w_2\} \subseteq I$. Thus, we have $w_1 \approx w_2 \in I$, as required for property A1. Furthermore, this derivation clearly preserves A2.

• $A_1(x) \to T(x, f_{T,A}(x))$. Assume that $A_1(w') \in I_n$, so the clause derives $T(w', f_{T,A}(w')) \in I_{n+1}$; the interesting case is when $f_{T,A}(w') \in \mathrm{aux}_{I_{n+1}}$ and $w' = w$. Then, for $w'' = w = w'$, we have

$\{T(w'', f_{T,A}(w'')),\ w'' \approx w'',\ f_{T,A}(w'') \approx f_{T,A}(w'')\} \subseteq I$,

as required for property A2. Furthermore, this derivation clearly preserves A1.

• $P(x, y) \to R(x, y)$. Assume that $P(w', f_{T,A}(w)) \in I_n$, so the clause derives $R(w', f_{T,A}(w)) \in I_{n+1}$; the interesting case is when $f_{T,A}(w) \in \mathrm{aux}_{I_{n+1}}$, which implies $f_{T,A}(w) \in \mathrm{aux}_{I_n}$. By the inductive hypothesis, then a term $w''$ exists such that $\{T(w'', f_{T,A}(w'')),\ w' \approx w'',\ f_{T,A}(w) \approx f_{T,A}(w'')\} \subseteq I$, as required for property A2. Furthermore, this derivation clearly preserves A1.

• $R(x, y) \wedge x \approx z \to R(z, y)$. Assume that $\{R(w_1', f_{T,A}(w)),\ w_1' \approx w'\} \subseteq I_n$, so the clause derives $R(w', f_{T,A}(w)) \in I_{n+1}$; the interesting case is when $f_{T,A}(w) \in \mathrm{aux}_{I_{n+1}}$, which implies $f_{T,A}(w) \in \mathrm{aux}_{I_n}$. By the inductive hypothesis, a term $w''$ exists such that $\{T(w'', f_{T,A}(w'')),\ w_1' \approx w'',\ f_{T,A}(w) \approx f_{T,A}(w'')\} \subseteq I$. By the transitivity of $\approx$, we have $w' \approx w'' \in I$, as required for property A2. Furthermore, this derivation clearly preserves A1.

• $R(x, y) \wedge y \approx z \to R(x, z)$. Assume that $\{R(w', f_{T,A}(w_1)),\ f_{T,A}(w_1) \approx f_{T,A}(w)\} \subseteq I_n$, and so the clause derives the fact $R(w', f_{T,A}(w)) \in I_{n+1}$; the interesting case is when $f_{T,A}(w) \in \mathrm{aux}_{I_{n+1}}$, which implies $f_{T,A}(w) \in \mathrm{aux}_{I_n}$. Then, clearly $f_{T,A}(w_1) \in \mathrm{aux}_{I_n}$. By the inductive hypothesis, a term $w''$ exists such that

$\{T(w'', f_{T,A}(w'')),\ w' \approx w'',\ f_{T,A}(w_1) \approx f_{T,A}(w'')\} \subseteq I$.

By the transitivity of $\approx$, then $f_{T,A}(w) \approx f_{T,A}(w'') \in I$, as required for property A2. Furthermore, this derivation clearly preserves A1.



We are now ready to show properties P1-P3. 



PROPERTY P1. Let $w_1'$, $w_1$, $w_2'$, $w_2$ be arbitrary terms, let $R$, $S$, and $T$ be arbitrary roles, and let $A$ be an arbitrary concept in $N_C \cup \{\top\}$. Assume that $\{R(w_1', f_{T,A}(w_1)),\ S(w_2', f_{T,A}(w_2)),\ f_{T,A}(w_1) \approx f_{T,A}(w_2)\} \subseteq I$ and $f_{T,A}(w_2) \in \mathrm{aux}_I$. By applying property A2 to $R(w_1', f_{T,A}(w_1))$ and $S(w_2', f_{T,A}(w_2))$, we have that two terms $w_1''$ and $w_2''$ exist such that

$\{T(w_1'', f_{T,A}(w_1'')),\ w_1' \approx w_1'',\ f_{T,A}(w_1) \approx f_{T,A}(w_1'')\} \subseteq I$, and

$\{T(w_2'', f_{T,A}(w_2'')),\ w_2' \approx w_2'',\ f_{T,A}(w_2) \approx f_{T,A}(w_2'')\} \subseteq I$.

By the transitivity of $\approx$, we have that $f_{T,A}(w_1'') \approx f_{T,A}(w_2'') \in I$, and so by property A1, we conclude that $w_1'' \approx w_2'' \in I$. Finally, since $\{w_1' \approx w_1'',\ w_1'' \approx w_2'',\ w_2'' \approx w_2'\} \subseteq I$, by the transitivity of $\approx$, we get $w_1' \approx w_2' \in I$, as required.

PROPERTY P2. We show by induction on $n \in \mathbb{N}$ that, for all terms $w_1$ and $w_2$ such that $|w_1| \le |w_2| \le n$, if $w_1 \approx w_2 \in I$, then $d(w_1, I) = d(w_2, I)$.

Base case. Let $w_1$ and $w_2$ be arbitrary terms such that $|w_1| = |w_2| = 0$ and $w_1 \approx w_2 \in I$. By the definition of $|\cdot|$, then $\{w_1, w_2\} \subseteq N_I$, so $d(w_1, I) = d(w_2, I) = 0$.

Inductive step. Consider an arbitrary $n \in \mathbb{N}$ and assume that the required property holds for all terms $w_1'$ and $w_2'$ such that $|w_1'| \le |w_2'| \le n$; we show that the same holds for arbitrary terms $w_1$ and $w_2$ such that $|w_1| \le |w_2| \le n + 1$. We consider the interesting case when $w_1 \approx w_2 \in I$, for which we consider two cases. First, if $w_2 \notin \mathrm{aux}_I$, then $d(w_1, I) = d(w_2, I) = 0$. Second, if $w_2 \in \mathrm{aux}_I$, then by property A1 there exist two terms $w_1'$ and $w_2'$, a role $T$, and a concept $A \in N_C \cup \{\top\}$ such that $w_1$ is of the form $f_{T,A}(w_1')$, term $w_2$ is of the form $f_{T,A}(w_2')$, and $w_1' \approx w_2' \in I$. By the inductive hypothesis, then $d(w_1', I) = d(w_2', I)$. Finally, by definition, we have $d(w_1, I) = d(w_2, I) = 1 + d(w_2', I)$, as required.

PROPERTY P3. Let $w_1'$ and $w_1$ be arbitrary terms, let $R$ and $T$ be arbitrary roles, let $A \in N_C \cup \{\top\}$ be an arbitrary concept, and assume that $R(w_1', f_{T,A}(w_1)) \in I$ and $f_{T,A}(w_1) \in \mathrm{aux}_I$. By property A2, then there exists a term $w_1''$ such that $\{T(w_1'', f_{T,A}(w_1'')),\ w_1' \approx w_1'',\ f_{T,A}(w_1) \approx f_{T,A}(w_1'')\} \subseteq I$. By the definition of $d(\cdot)$, then $d(f_{T,A}(w_1''), I) = 1 + d(w_1'', I)$. Furthermore, by property P2, then $d(f_{T,A}(w_1), I) = d(f_{T,A}(w_1''), I)$ and $d(w_1', I) = d(w_1'', I)$. Finally, these observations imply that $d(f_{T,A}(w_1), I) = 1 + d(w_1', I)$, as required. □

We now have all the ingredients required to prove Theorem 10. We start by showing completeness. 

Lemma 15 (Completeness). Let $\mathcal{K} = \langle \mathcal{T}, \mathcal{A} \rangle$ be a satisfiable $\mathcal{ELHO}^r_\bot$ KB, let $q = \exists \vec{y}.\psi(\vec{x}, \vec{y})$ be a CQ, and let $\pi : \vec{x} \mapsto N_I$ be a candidate answer for $q$. Then, $\Xi(\mathcal{K}) \models \pi(q)$ implies that a substitution $\tau$ exists such that $\mathrm{dom}(\tau) = \vec{x} \cup \vec{y}$, $\tau|_{\vec{x}} = \pi$, $D(\mathcal{K}) \models \tau(q)$, and $\mathrm{isSpur}(q, D(\mathcal{K}), \tau) = \mathsf{f}$.

Proof. Let $I$ and $J$ be the minimal Herbrand models of $\Xi(\mathcal{K})$ and $D(\mathcal{K})$, respectively. Since $\Xi(\mathcal{K}) \models \pi(q)$, a substitution $\pi^*$ exists such that $\mathrm{dom}(\pi^*) = \vec{x} \cup \vec{y}$, $\pi^*|_{\vec{x}} = \pi$, and $\pi^*(q) \subseteq I$. Let $\delta$ be the mapping from $I$ to $J$ defined in the section about the datalog rewriting of $\mathcal{K}$. We define $\tau$ as the substitution such that, for each term $t \in N_T(q)$, we have $\tau(t) := \delta(\pi^*(t))$. Finally, let $\sim$ be the relation for $\tau$, $q$, and $D(\mathcal{K})$ as specified in Definition 7. Since $\delta$ is a homomorphism from $I$ to $J$ by Lemma 3, we have $J \models \tau(q)$. We next prove $\mathrm{isSpur}(q, \tau, D(\mathcal{K})) = \mathsf{f}$ by showing that all conditions of Definition 9 are satisfied.

(Condition a) By the definition of $\tau$, for each $x \in \vec{x}$, we have $\tau(x) \in N_I$.

(Condition b) We prove that, for each $s \sim t$, we have $\tau(s) \approx \tau(t) \in J$ and $\pi^*(s) \approx \pi^*(t) \in I$. We proceed by induction on the number of steps required to derive $s \sim t$. For the base case, the empty relation $\sim$ clearly satisfies the two properties. For the inductive step, consider an arbitrary relation $\sim$ obtained in $n$ steps that satisfies these constraints; we show that the same holds for all constraints derivable from $\sim$. Since relation $\approx$ in both $J$ and $I$ is reflexive, symmetric, and transitive, the derivation of $s \sim t$ due to reflexivity, symmetry, or transitivity clearly preserves the required properties; thus, we focus on the (fork) rule. Let $s'$, $s$, $t'$, and $t$ be arbitrary terms in $N_T(q)$, and let $R$ and $P$ be arbitrary roles such that $s' \sim t'$ is obtained in $n$ steps, atoms $R(s, s')$ and $P(t, t')$ occur in $q$, and $\tau(s') \in \mathrm{aux}_{D(\mathcal{K})}$. By the inductive hypothesis, we have $\tau(s') \approx \tau(t') \in J$ and $\pi^*(s') \approx \pi^*(t') \in I$. Since $J$ is the minimal Herbrand model of $D(\mathcal{K})$, we have $\tau(t') \in \mathrm{aux}_J$, so no individual $c \in N_I$ exists such that $\tau(t') \approx c \in J$. By Lemmas 3 and 4, $\tau(t') \notin \mathrm{aux}_J$ if and only if $\pi^*(t') \notin \mathrm{aux}_I$; hence, $\pi^*(t') \in \mathrm{aux}_I$. Since $\{R(\pi^*(s), \pi^*(s')),\ P(\pi^*(t), \pi^*(t')),\ \pi^*(s') \approx \pi^*(t')\} \subseteq I$ and $\pi^*(t') \in \mathrm{aux}_I$, by property P1 of Lemma 14 we have $\pi^*(s) \approx \pi^*(t) \in I$. Finally, since $\delta$ is a homomorphism (see Lemma 3), by the construction of $\tau$ we have $\tau(s) \approx \tau(t) \in J$, as required.

(Condition c) To show that $q$ is aux-acyclic w.r.t. $\tau$ and $D(\mathcal{K})$, we assume the opposite; hence, there exists a sequence of vertices $v_0, \ldots, v_m \in V_{aux}$ such that $m > 0$, for each $0 \le i < m$ we have $\langle v_i, v_{i+1} \rangle \in E_{aux}$, and $v_m = v_0$. Consider an arbitrary $i < m$ and the corresponding edge $\langle v_i, v_{i+1} \rangle \in E_{aux}$. By the definition of $E_{aux}$, an atom $R_i(s_i, s_{i+1})$ exists in $q$ such that $\gamma(s_i) = v_i$ and $\gamma(s_{i+1}) = v_{i+1}$; hence, we have $s_i \sim v_i$ and $s_{i+1} \sim v_{i+1}$. Since $\tau$ satisfies all the constraints in $\sim$, by the definition of $G_{aux}$ we have that $\{\tau(s_i), \tau(s_{i+1})\} \subseteq \mathrm{aux}_{D(\mathcal{K})}$. By Lemmas 3 and 4, then $\{\pi^*(s_i), \pi^*(s_{i+1})\} \subseteq \mathrm{aux}_I$ as well. In addition, since $R_i(s_i, s_{i+1})$ is an atom in $q$, we have $R_i(\pi^*(s_i), \pi^*(s_{i+1})) \in I$. Also, since $s_i \sim v_i$, $s_{i+1} \sim v_{i+1}$, and substitution $\pi^*$ satisfies the constraints in $\sim$, we have $\{\pi^*(s_i) \approx \pi^*(v_i),\ \pi^*(s_{i+1}) \approx \pi^*(v_{i+1})\} \subseteq I$. Then, by property P2 of Lemma 14, we have

$d(\pi^*(s_i), I) = d(\pi^*(v_i), I)$ and
$d(\pi^*(s_{i+1}), I) = d(\pi^*(v_{i+1}), I)$.

Finally, since $R_i(\pi^*(s_i), \pi^*(s_{i+1})) \in I$, by property P3 of Lemma 14 we have $d(\pi^*(v_{i+1}), I) = 1 + d(\pi^*(v_i), I)$. But then, since $v_m = v_0$, we also have $d(\pi^*(v_m), I) = d(\pi^*(v_0), I)$, which is a contradiction. □
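Condition (c) amounts to checking that the directed graph $G_{aux}$ has no cycle. A minimal sketch of such a check, using a standard depth-first search with three colours, is given below. The graph encoding (a set of vertices and a set of edge pairs) is an assumption, and the construction of $G_{aux}$ from the query is not shown; every edge endpoint is expected to be listed among the vertices.

```python
def has_cycle(vertices, edges):
    """Return True if the directed graph (vertices, edges) contains a cycle."""
    succ = {v: [] for v in vertices}
    for u, v in edges:
        succ[u].append(v)

    WHITE, GREY, BLACK = 0, 1, 2           # unvisited, on the current path, finished
    colour = {v: WHITE for v in vertices}

    def visit(u):
        colour[u] = GREY
        for v in succ[u]:
            # A grey successor closes a cycle; a white one is explored recursively.
            if colour[v] == GREY or (colour[v] == WHITE and visit(v)):
                return True
        colour[u] = BLACK
        return False

    return any(colour[v] == WHITE and visit(v) for v in vertices)


print(has_cycle({"v0", "v1"}, {("v0", "v1"), ("v1", "v0")}))
print(has_cycle({"v0", "v1", "v2"}, {("v0", "v1"), ("v1", "v2")}))
```

In terms of the proof above, a cycle would force the depth of a vertex to be strictly larger than itself, which is exactly the contradiction derived from P2 and P3.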

We are left to prove the soundness of our approach. Let $\tau$ be an arbitrary substitution for $q$ w.r.t. $D(\mathcal{K})$ such that $J \models \tau(q)$ and $\mathrm{isSpur}(q, \tau, D(\mathcal{K})) = \mathsf{f}$. Furthermore, let $G_{aux}$ be the graph as specified in Definition 8. We next show that $I \models \tau|_{\vec{x}}(q)$. In order to do so, we first define the graph $G_q$ of the query $q$.

Definition 16. Let $\gamma$ and $V_{aux}$ be as in Definition 8. The query graph $G_q = (V_q, E_q)$ is the directed graph defined as follows.

• $V_q$ is the smallest set containing $\gamma(t)$ for each $t \in N_T(q)$.

• $E_q$ is the smallest set containing $\langle \gamma(s), \gamma(t) \rangle$ for all terms $\{s, t\} \subseteq N_T(q)$ such that query $q$ contains $R(s, t)$ for some $R$.

Vertex $v \in V_q$ is a root if $v \notin V_{aux}$ or, for each vertex $v' \in V_q$, we have $\langle v', v \rangle \notin E_q$.

Clearly, by the definition, $G_{aux}$ is a subgraph of $G_q$. We prove the soundness claim in three steps. First, we show that the graph $G_q$ is a forest. Second, we define by structural induction on the forest $G_q$ a substitution $\pi$ for $q$ w.r.t. $\Xi(\mathcal{K})$ such that $\tau|_{\vec{x}} = \pi|_{\vec{x}}$. Third, we prove that $I \models \pi(q)$ holds.

Lemma 17. If $\mathrm{isSpur}(q, \tau, D(\mathcal{K})) = \mathsf{f}$, then $G_q$ is a forest.

Proof. Due to $\mathrm{isSpur}(q, \tau, D(\mathcal{K})) = \mathsf{f}$, we have that $G_{aux}$ is a directed acyclic graph. Consider an arbitrary vertex $v \in V_{aux}$ and arbitrary vertices $v_1, v_2 \in V_q$ such that $\{\langle v_1, v \rangle, \langle v_2, v \rangle\} \subseteq E_q$; we next show that $v_1 = v_2$. By the definition of $G_q$, we have that terms $\{s, s', t, t'\} \subseteq N_T(q)$ and roles $R$ and $P$ exist such that all of the following conditions are satisfied:

• atoms $R(s, s')$ and $P(t, t')$ are in $q$;

• $\gamma(s') = v = \gamma(t')$, $\gamma(s) = v_1$, and $\gamma(t) = v_2$; and

• $\{\tau(s'), \tau(t')\} \subseteq \mathrm{aux}_J$.

Due to the (fork) rule, we have $s \sim t$. By the definition of $\gamma$, we have $\gamma(s) = \gamma(t)$, which implies $v_1 = v_2$, as required. □
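The role played by the (fork) rule in this argument can be illustrated with a small union-find sketch. It reconstructs the rule only from the way it is used in the proofs above: if $s' \sim t'$, atoms $R(s, s')$ and $P(t, t')$ occur in the query, and $\tau(s')$ is a 'true' auxiliary individual, then $s \sim t$. The function name, the triple encoding of binary atoms, and the restriction to binary atoms are assumptions of this sketch, and Definition 7 itself is not reproduced here.

```python
def fork_closure(atoms, tau, aux):
    """Smallest equivalence on query terms closed under the (fork) rule as used above.

    atoms -- binary query atoms as triples (role, s, t)
    tau   -- dict mapping each query term to an individual of J
    aux   -- set of 'true' auxiliary individuals
    Returns a dict mapping each term to a representative of its class.
    """
    terms = {s for (_, s, _) in atoms} | {t for (_, _, t) in atoms}
    parent = {t: t for t in terms}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]      # path halving
            t = parent[t]
        return t

    changed = True
    while changed:                             # iterate until no more merges apply
        changed = False
        for (_, s, s1) in atoms:
            for (_, t, t1) in atoms:
                if find(s1) == find(t1) and tau[s1] in aux and find(s) != find(t):
                    parent[find(s)] = find(t)  # (fork): merge the two predecessors
                    changed = True
    return {t: find(t) for t in terms}


# Hypothetical query with atoms R(x, z) and P(y, z), where z is matched to an auxiliary individual.
classes = fork_closure([("R", "x", "z"), ("P", "y", "z")],
                       {"x": "a", "y": "b", "z": "o"}, {"o"})
print(classes["x"] == classes["y"])
```

After the closure, every vertex matched to a 'true' auxiliary individual has at most one predecessor class, which is the forest property shown in Lemma 17.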

By structural induction on the forest-shaped graph $G_q$, we next define the substitution $\pi$ as follows; we will later show that $\Xi(\mathcal{K}) \models \pi(q)$.

• For the base case, let $v$ be an arbitrary root of $G_q$. For each term $t \in N_T(q)$ such that $\gamma(t) = v$, we define $\pi(t)$ as an arbitrary term $w \in \mathrm{dom}(I)$ such that $\delta(w) = \tau(t)$.

• For the inductive step, let $v$ be an arbitrary vertex of $G_q$ such that $v \in V_{aux}$, term $\tau(v)$ is of the form $o_{R,A}$, the value of $\pi(v)$ is undefined, $v'$ is the unique vertex of $G_q$ such that $\langle v', v \rangle \in E_q$, and $\pi(v')$ has already been defined. For each term $t \in N_T(q)$ such that $\gamma(t) = v$, we define $\pi(t) := f_{R,A}(\pi(v'))$.
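The inductive definition of $\pi$ above can be sketched as a top-down traversal of the forest. The code below works on vertices rather than on individual query terms, and all parameter names and the tuple encoding `("f", R, A, argument)` of functional terms are hypothetical; it assumes that the forest structure and a suitable choice of terms for the roots have already been computed.

```python
def build_pi(roots, children, tau_form, root_choice):
    """Assign a term of I to every vertex of the forest G_q, top-down.

    roots       -- root vertices of G_q
    children    -- dict: vertex -> list of its child vertices (children lie in V_aux)
    tau_form    -- dict: vertex v in V_aux -> (R, A) such that tau(v) = o_{R,A}
    root_choice -- dict: root v -> some term w in dom(I) with delta(w) = tau(v)
    """
    pi = {}
    stack = []
    for r in roots:
        pi[r] = root_choice[r]                        # base case
        stack.append(r)
    while stack:
        parent = stack.pop()
        for v in children.get(parent, []):
            role, concept = tau_form[v]
            pi[v] = ("f", role, concept, pi[parent])  # pi(v) := f_{R,A}(pi(v'))
            stack.append(v)
    return pi


pi = build_pi(roots=["x"],
              children={"x": ["y"], "y": ["z"]},
              tau_form={"y": ("R", "A"), "z": ("S", "B")},
              root_choice={"x": "a"})
print(pi["z"])
```

Each auxiliary individual of $J$ is thereby 'unfolded' back into a distinct functional term of $I$, guided by the unique predecessor that the forest shape guarantees.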

Lemma 18. Substitution $\pi$ satisfies the following two properties for each vertex $v \in V_q$ and all terms $s, t \in N_T(q)$ such that $\gamma(s) = v = \gamma(t)$:

M1. $\delta(\pi(s)) = \tau(s)$, and

M2. $\pi(s) \approx \pi(t) \in I$.

Proof. We prove properties M1 and M2 by structural induction on the forest $G_q$.

Base case. Let $v$ be an arbitrary root of $G_q$, and let $s, t \in N_T(q)$ be arbitrary terms such that $\gamma(s) = v = \gamma(t)$. Property M1 follows from the fact that $\pi(s) \in \{w \in \mathrm{dom}(I) \mid \delta(w) = \tau(s)\}$. We next prove property M2. By the definition of $\gamma$, we have that $s \sim t$. Since $\mathrm{isSpur}(q, \tau, D(\mathcal{K})) = \mathsf{f}$, we have $\tau(s) \approx \tau(t) \in J$. We have the following two cases.

• Assume that $v \in V_{aux}$. Clearly, $\{\tau(s), \tau(t)\} \subseteq \mathrm{aux}_J$. By the construction of $J$, there exists $n \in \mathbb{N}$ such that $\tau(s) \approx \tau(t) \in J_n$ and $\tau(t) \in \mathrm{aux}_{J_n}$. By Lemma 13, we have $\tau(s) = \tau(t)$. Thus, $\pi(s) = \pi(t)$ and $\pi(s) \approx \pi(t) \in I$, as required.

• Assume that $v \notin V_{aux}$. Then, we have $\tau(t) \notin \mathrm{aux}_J$ and, by Lemma 4, we have $\pi(s) \approx \pi(t) \in I$.

Inductive step. Let $v \in V_{aux}$ be an arbitrary vertex, let $s, t \in N_T(q)$ be arbitrary terms such that $\gamma(s) = v = \gamma(t)$, and assume that $\tau(v)$ is of the form $o_{R,A}$. By the definition of $\gamma$, we have that $s \sim t$. Since $\mathrm{isSpur}(q, \tau, D(\mathcal{K})) = \mathsf{f}$, we have $\tau(s) \approx \tau(t) \in J$. Since $v \in V_{aux}$, we have $\{\tau(v), \tau(s), \tau(t)\} \subseteq \mathrm{aux}_J$. Then, by the construction of $J$, some $n \in \mathbb{N}$ exists such that $\{\tau(v) \approx \tau(s),\ \tau(s) \approx \tau(t)\} \subseteq J_n$ and $\{\tau(s), \tau(t)\} \subseteq \mathrm{aux}_{J_n}$. By Lemma 13, we have $\tau(v) = \tau(s)$ and $\tau(s) = \tau(t)$. Now let $v'$ be the unique vertex of $G_q$ such that $\langle v', v \rangle \in E_q$. By definition, $\pi(s) = f_{R,A}(\pi(v')) = \pi(t)$. Also, by the definition of $\delta$, we have $\delta(f_{R,A}(\pi(v'))) = o_{R,A}$, so property M1 holds. By the reflexivity of $\approx$, we have $\pi(s) \approx \pi(t) \in I$, and so property M2 holds, as required. □



We finally prove the soundness of our approach. 

Lemma 19 (Soundness). Let $I$ and $J$ be the minimal Herbrand models of $\Xi(\mathcal{K})$ and $D(\mathcal{K})$, respectively; let $q = \exists \vec{y}.\psi(\vec{x}, \vec{y})$ be
an arbitrary CQ; and let $\tau$ be an arbitrary substitution such that $\tau(q) \subseteq J$ and $\mathsf{isSpur}(q, D(\mathcal{K}), \tau) = \mathsf{f}$. Then, $\tau|_{\vec{x}}(q) \in I$.

Proof. For $q$ and $\tau$ as specified in the lemma, let $\pi$ be the substitution defined as specified just before Lemma 18, and assume
that $\mathsf{isSpur}(q, D(\mathcal{K}), \tau) = \mathsf{f}$. By definition, we have $\pi|_{\vec{x}} = \tau|_{\vec{x}}$. We next show that $\pi(q) \subseteq I$.

First, let $A(t)$ be an arbitrary unary atom of $q$; we show that $A(\pi(t)) \in I$. By assumption, we have $A(\tau(t)) \in J$. By Lemma 4,
for each term $w \in \mathrm{dom}(I)$ such that $\delta(w) = \tau(t)$, we have that $A(w) \in I$. By property M1 of Lemma 18, we have $A(\pi(t)) \in I$.

Second, let $R(t', t)$ be an arbitrary binary atom of $q$; we show that $R(\pi(t'), \pi(t)) \in I$. By assumption, we have $R(\tau(t'), \tau(t)) \in J$.
We distinguish the following two cases.

1. Assume that $\tau(t) \notin \mathsf{aux}_J$. By Lemma 4, for all terms $w', w \in \mathrm{dom}(I)$ such that $\delta(w') = \tau(t')$ and $\delta(w) = \tau(t)$, we have
$R(w', w) \in I$. By property M1 of Lemma 18, we have $R(\pi(t'), \pi(t)) \in I$.

2. Assume that $\tau(t) \in \mathsf{aux}_J$, and assume that $\tau(t)$ is of the form $o_{R,A}$. Furthermore, let $v'$ be the unique vertex of $G_q$ such that
$(v', \gamma(t)) \in E_q$ and $\gamma(t') = v'$. By the definition of $\pi$, we have $\pi(t) = f_{R,A}(\pi(v'))$. Since $\mathsf{isSpur}(q, D(\mathcal{K}), \tau) = \mathsf{f}$, we have
$\tau(v') \approx \tau(t') \in J$. Since $\approx$ is a congruence relation, we have $R(\tau(v'), \tau(t)) \in J$. By Lemma 4, for each term $w' \in \mathrm{dom}(I)$
such that $\delta(w') = \tau(v')$, we have $R(w', f_{R,A}(w')) \in I$. By property M1 of Lemma 18, we have $R(\pi(v'), f_{R,A}(\pi(v'))) \in I$,
and by property M2 of Lemma 18, we have $\pi(t') \approx \pi(v') \in I$. Therefore, we have $R(\pi(t'), f_{R,A}(\pi(v'))) \in I$. □

Proof of Theorem 11 

Theorem 11. Deciding whether $\mathcal{K} \models \pi(q)$ holds can be implemented in nondeterministic polynomial time w.r.t. the size of $\mathcal{K}$
and $q$, and in polynomial time w.r.t. the size of $\mathcal{A}$.

Proof. First, we argue that we can compute relation $\sim$ in polynomial time. For each term $u$, we can decide whether $u \in \mathsf{aux}_{D(\mathcal{K})}$
by checking whether, for each term $u'$, we have that $D(\mathcal{K}) \models u \approx u'$ implies $u' \notin N_I$. Since the number of variables occurring
in each clause in $D(\mathcal{K})$ is bounded by a constant, this check can be performed in polynomial time. Thus, we can evaluate in
polynomial time the precondition of the (fork) rule. In addition, the size of relation $\sim$ is bounded by $|N_T(q)|^2$, the rules used
to compute it are monotonic, and each inference can be applied in polynomial time, so we can compute $\sim$ in polynomial time.
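
The saturation argument can be illustrated with a small sketch. The Python below is only an illustration under stated assumptions: the shape of the (fork) rule used here (atoms R(s, s') and P(t, t') of q with s' ∼ t' and τ(s') auxiliary yield s ∼ t) is an assumed reading of the definition in the main text, and the names `compute_sim`, `binary_atoms`, and `is_aux` are hypothetical. The predicate `is_aux` stands in for the polynomial-time test of membership in the set of auxiliary terms of D(K).

```python
from itertools import product

def compute_sim(terms, binary_atoms, is_aux):
    """Saturate the relation ~ over the terms of a query.

    terms        -- iterable of query terms
    binary_atoms -- list of (role, s, t) triples, one per atom R(s, t) of q
    is_aux       -- hypothetical predicate: is_aux(t) means tau(t) is auxiliary
    """
    sim = {(t, t) for t in terms}  # reflexivity
    changed = True
    while changed:  # each productive round adds at least one of |terms|^2 pairs
        changed = False
        new = {(b, a) for (a, b) in sim}  # symmetry
        # transitivity
        new |= {(a, d) for (a, b) in sim for (c, d) in sim if b == c}
        # assumed (fork) rule: R(s, s1), P(t, t1) in q, s1 ~ t1, tau(s1) auxiliary
        for (_, s, s1), (_, t, t1) in product(binary_atoms, repeat=2):
            if (s1, t1) in sim and is_aux(s1):
                new.add((s, t))
        if not new <= sim:
            sim |= new
            changed = True
    return sim
```

Because the rules only ever add pairs and at most $|N_T(q)|^2$ pairs exist, the loop terminates after polynomially many rounds, which mirrors the monotonicity argument in the proof.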

Second, we show that we can decide whether $q$ is aux-cyclic w.r.t. $\tau$ in polynomial time. Since $\sim$ can be computed in
polynomial time and the size of $G_{\mathsf{aux}}$ is polynomially bounded by the number of terms occurring in $q$, we can compute $G_{\mathsf{aux}}$ in
polynomial time. Also, we can check whether $G_{\mathsf{aux}}$ is acyclic by searching for a topological ordering of its vertices in linear
time (Cormen et al. 2009).
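
As an illustration of the linear-time acyclicity test, the following Python sketch uses Kahn's algorithm, one standard way to search for a topological ordering; the proof does not say which algorithm is intended, so this choice is an assumption. The function name `is_acyclic` and the edge-list representation of the graph are hypothetical.

```python
from collections import deque

def is_acyclic(vertices, edges):
    """Return True iff the directed graph admits a topological ordering."""
    indeg = {v: 0 for v in vertices}
    succ = {v: [] for v in indeg}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    # start from all vertices that have no incoming edge
    queue = deque(v for v, d in indeg.items() if d == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    # vertices on a cycle never reach in-degree 0, so they are never visited
    return visited == len(indeg)
```

Each vertex and each edge is processed at most once, so the running time is linear in the size of the graph.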

For the NP upper bound, according to Theorem 10, checking whether $\mathcal{K} \models \pi(q)$ amounts to guessing a candidate answer $\tau$
for $q$ in the minimal Herbrand model of $D(\mathcal{K})$ such that $\tau|_{\vec{x}} = \pi$ and to checking that $\mathsf{isSpur}(q, D(\mathcal{K}), \tau) = \mathsf{f}$. Since each clause
in $D(\mathcal{K})$ has a bounded number of variables, the minimal Herbrand model of $D(\mathcal{K})$ can be computed in polynomial time. By
the first two observations, we conclude that the whole process can be carried out in nondeterministic polynomial time in the
combined size of $D(\mathcal{K})$ and $q$.

For the PTime upper bound, consider a fixed $\mathcal{ELHO}^r_\bot$ TBox $\mathcal{T}$ and a fixed conjunctive query $q$. For an arbitrary ABox $\mathcal{A}$,
we can enumerate in polynomial time all possible answers to $q$ in the minimal Herbrand model of $D(\mathcal{T}) \cup \mathcal{A}$. Also, we can
filter out those answers that are spurious in polynomial time. Finally, we just check whether $\pi$ occurs in the remaining (certain)
answers. □
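
The enumerate-then-filter procedure behind the PTime bound can be sketched as follows. This is a minimal illustration, not the implementation evaluated in the paper: the fact and atom encodings, the `?`-prefix convention for variables, and the names `matches`, `certain_answers`, and `is_spurious` are all assumptions, and `is_spurious` is a hypothetical callback standing in for the isSpur check.

```python
def matches(atoms, facts):
    """Enumerate all substitutions mapping every atom of the query to a fact.

    atoms -- list of (predicate, args) pairs; args starting with '?' are variables
    facts -- finite collection of (predicate, args) pairs over constants
    """
    def extend(i, subst):
        if i == len(atoms):
            yield dict(subst)
            return
        pred, args = atoms[i]
        for fpred, fargs in facts:
            if fpred != pred or len(fargs) != len(args):
                continue
            new = dict(subst)
            ok = True
            for a, c in zip(args, fargs):
                if a.startswith("?"):
                    # bind the variable, or check an existing binding
                    if new.setdefault(a, c) != c:
                        ok = False
                        break
                elif a != c:
                    ok = False
                    break
            if ok:
                yield from extend(i + 1, new)
    yield from extend(0, {})

def certain_answers(atoms, facts, answer_vars, is_spurious):
    """Keep the answer tuples of all matches that pass the spuriousness filter."""
    result = set()
    for tau in matches(atoms, facts):
        if not is_spurious(tau):
            result.add(tuple(tau[x] for x in answer_vars))
    return result
```

For a fixed query with k atoms, the enumeration inspects at most |facts|^k combinations, which is polynomial in the size of the data. Checking whether a given tuple is an answer then reduces to a membership test on the returned set.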



